Computational linguistics is the study of natural language as carried out by computer programs. Typically, this means a program trying to understand English (or French, or any other natural language) typed or spoken by a person. This field is also known as Natural Language Processing (NLP).
There are several problems in getting programs to understand natural language. Many of these are due to the ambiguity in the language:
- word boundary detection: in spoken language, there are usually no gaps between words; where to place a word boundary often depends on which choice makes the most sense grammatically and in context.
- word sense disambiguation: the same word can have several different meanings; we have to select the meaning that makes the most sense in context.
- syntactic ambiguity: the grammar of a natural language is ambiguous, i.e. there are often multiple possible parse trees for a given sentence. Choosing the correct one requires semantic information.
- speech acts and plans: sentences often mean something other than what they literally say; for instance, the correct response to "Can you pass the salt?" is to pass the salt, not to say "yes". Similarly, if a class was not offered last year, the correct answer to the question "How many students failed the class last year?" is "The class was not offered last year", not "None".
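The word boundary problem can be illustrated with a small sketch. The lexicon and input string below are made up for this example, and the function name `segmentations` is hypothetical; a real system would use a large dictionary and a statistical model to rank the candidates. The sketch only enumerates every way of splitting an unspaced string into known words, which shows that more than one reading can be valid.

```python
# Toy lexicon, invented for illustration only.
LEXICON = {"the", "table", "down", "there", "theta", "bled", "own"}


def segmentations(text, lexicon):
    """Return every way to split `text` into a sequence of lexicon words."""
    if not text:
        # One way to segment the empty string: the empty sequence.
        return [[]]
    results = []
    # Try every prefix that is a known word, then segment the remainder.
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in lexicon:
            for rest in segmentations(text[end:], lexicon):
                results.append([word] + rest)
    return results


for candidate in segmentations("thetabledownthere", LEXICON):
    print(" ".join(candidate))
# prints:
# the table down there
# theta bled own there
```

Both splits consist entirely of valid words, so the lexicon alone cannot decide between them; picking "the table down there" requires the grammatical and contextual knowledge described above.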
Another version of this article:
Computational Linguistics (CL) is a sub-discipline of both Artificial Intelligence (AI) and linguistics that applies computational methods to the scientific study of human language. A closely related term that emphasizes the engineering aspect of CL is natural language processing.
The [Association for Computational Linguistics definition]:
- Computational linguistics is the scientific study of language from a computational perspective. Computational linguists are interested in providing computational models of various kinds of linguistic phenomena.