Natural language processing (NLP) is the computer analysis of input provided in a human language (natural language) and the conversion of that input into a useful form of representation.
The field of NLP is primarily concerned with getting computers to perform useful and interesting tasks with human languages.
It is secondarily concerned with helping us come to a better understanding of human language.

Components of NLP

Natural Language Understanding

– Mapping the given input in natural language into a useful representation.
– Different levels of analysis are required (a toy sketch in Python follows the list):

  • morphological analysis,
  • syntactic analysis,
  • semantic analysis,
  • discourse analysis
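
To make this concrete, here is a minimal, purely illustrative sketch of the first two levels; the suffix list and lexicon are hypothetical toys, not a real analyzer.

```python
# Toy suffix list and lexicon, chosen only for illustration;
# real analyzers use far richer morphological and lexical models.
SUFFIXES = ["ing", "ed", "s"]
LEXICON = {
    "the": "DET",
    "dog": "NOUN", "cat": "NOUN",
    "walk": "VERB", "chase": "VERB",
}

def morphological_analysis(word):
    """Split a word into a stem and (at most one) inflectional suffix."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and word[:-len(suffix)] in LEXICON:
            return word[:-len(suffix)], suffix
    return word, None

def syntactic_tags(sentence):
    """Assign a coarse part-of-speech tag to every token in the sentence."""
    tagged = []
    for token in sentence.lower().split():
        stem, _suffix = morphological_analysis(token)
        tagged.append((token, LEXICON.get(stem, "UNKNOWN")))
    return tagged

print(syntactic_tags("The dogs walked"))
# [('the', 'DET'), ('dogs', 'NOUN'), ('walked', 'VERB')]
```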


Natural Language Generation

– Producing output in natural language from some internal representation.
– Different levels of synthesis are required (a toy sketch follows the list):

  • deep planning (what to say),
  • syntactic generation (how to say it)
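
As a rough illustration of these two stages, here is a toy sketch; the internal representation (a list of fact dictionaries) and the sentence template are hypothetical simplifications.

```python
# Toy internal representation and a single sentence template,
# both hypothetical simplifications for illustration only.

def content_planning(facts):
    """Deep planning: decide *what* to say, here by skipping known facts."""
    return [fact for fact in facts if not fact.get("already_known")]

def syntactic_generation(fact):
    """Surface realization: decide *how* to say one planned fact."""
    return f"{fact['agent'].capitalize()} {fact['action']} {fact['object']}."

facts = [
    {"agent": "the parser", "action": "produced", "object": "three parse trees"},
    {"agent": "the tagger", "action": "labelled", "object": "twelve tokens",
     "already_known": True},
]

for fact in content_planning(facts):
    print(syntactic_generation(fact))
# The parser produced three parse trees.
```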

Knowledge of Language

Phonology – concerns how words are related to the sounds that realize them.
Morphology – concerns how words are constructed from more basic meaning units called morphemes. A morpheme is the primitive unit of meaning in a language.
Syntax – concerns how words can be put together to form correct sentences; it determines what structural role each word plays in the sentence and what phrases are subparts of other phrases.
Semantics – concerns what words mean and how these meanings combine in sentences to form sentence meaning. The study of context-independent meaning.
Pragmatics – concerns how sentences are used in different situations and how use affects the interpretation of the sentence.
Discourse – concerns how the immediately preceding sentences affect the interpretation of the next sentence, for example when interpreting pronouns or the temporal aspects of the information (a toy pronoun-resolution sketch appears after these definitions).
World Knowledge – includes general knowledge about the world and what each language user must know about the other’s beliefs and goals.
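
As a small illustration of the discourse level, here is a toy pronoun resolver that simply picks the most recently mentioned noun as the antecedent; the noun and pronoun inventories are hypothetical, and real coreference resolution also draws on syntax, semantics, and world knowledge.

```python
# Toy noun and pronoun inventories, chosen only for illustration.
NOUNS = {"printer", "report"}
PRONOUNS = {"it", "she", "he", "they"}

def resolve_pronouns(sentences):
    """Map each pronoun occurrence to the most recently mentioned noun."""
    last_noun = None
    resolutions = []
    for sentence in sentences:
        for word in sentence.lower().rstrip(".").split():
            if word in PRONOUNS and last_noun is not None:
                resolutions.append((word, last_noun))
            elif word in NOUNS:
                last_noun = word
    return resolutions

print(resolve_pronouns(["The printer jammed.", "Alice restarted it."]))
# [('it', 'printer')]
```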

Models to Represent Linguistic Knowledge

We will use certain formalisms (models) to represent the required linguistic knowledge.
State Machines -- FSAs, FSTs, HMMs, ATNs, RTNs (a minimal FSA is sketched after this list)
Formal Rule Systems -- Context Free Grammars, Unification Grammars, Probabilistic CFGs.
Logic-based Formalisms -- first order predicate logic, some higher order logic.
Models of Uncertainty -- Bayesian probability theory
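
To show what the simplest of these formalisms looks like in practice, here is a minimal finite-state automaton encoded as a transition table; it accepts toy "sheeptalk" strings of the form b a a … a !, a language chosen purely for illustration.

```python
# Transition table for a toy FSA over the "sheeptalk" language.
TRANSITIONS = {            # (state, input symbol) -> next state
    (0, "b"): 1,
    (1, "a"): 2,
    (2, "a"): 3,
    (3, "a"): 3,           # loop: any number of extra a's
    (3, "!"): 4,
}
ACCEPTING_STATES = {4}

def accepts(string):
    """Run the automaton; accept only if it ends in an accepting state."""
    state = 0
    for symbol in string:
        state = TRANSITIONS.get((state, symbol))
        if state is None:          # no legal transition: reject immediately
            return False
    return state in ACCEPTING_STATES

print(accepts("baaaa!"))   # True
print(accepts("ba!"))      # False
```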

Algorithms to Manipulate Linguistic Knowledge

We will use algorithms to manipulate the models of linguistic knowledge to produce the desired behavior.
Most of the algorithms we will study are transducers and parsers. These algorithms construct some structure based on their input.
Since language is ambiguous at all levels, these algorithms are never simple processes.
Most of the algorithms we will use fall into the following categories (a worked dynamic-programming example follows the list):

  • state space search
  • dynamic programming
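
As a concrete instance of the dynamic-programming category, here is a short minimum edit distance implementation (insertions, deletions, and substitutions all cost 1); the same table-filling pattern underlies chart parsing and Viterbi decoding.

```python
def min_edit_distance(source, target):
    """Fill a table of subproblem costs: dist[i][j] is the cost of turning
    source[:i] into target[:j] using insertions, deletions and substitutions."""
    n, m = len(source), len(target)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i                      # delete all of source[:i]
    for j in range(1, m + 1):
        dist[0][j] = j                      # insert all of target[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,                 # deletion
                dist[i][j - 1] + 1,                 # insertion
                dist[i - 1][j - 1] + substitution,  # (mis)match
            )
    return dist[n][m]

print(min_edit_distance("intention", "execution"))   # 5
```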

Conclusion

NLP empowers computers to process human language: it converts natural-language input into a usable internal representation and generates meaningful output from that representation. Multiple levels of analysis drive the core of NLP: morphological, syntactic, semantic, discourse, and pragmatic analysis. Beyond letting computers perform useful tasks with human language, NLP also helps us reach a better understanding of language itself. As the models and algorithms applied to NLP advance, the gap between human communication and machine understanding continues to narrow.

Happy Processing!

