Computational Linguistics
Computational linguistics is a discipline that uses mathematical models to analyze and process natural language and implements that analysis and processing as computer programs, with the goal of having machines simulate part or all of a person's language abilities.
Basic content
Computational linguistics can be divided into the following three categories according to the nature and complexity of its work:
- Automatic editing: This is what computers do best, and it is the most mature part of computational linguistics. It involves counting, classifying, and sorting language materials; compiling word lists, indexes, and dictionaries; and building corpora and terminology databases.
- Automatic analysis: This is a more complex form of automatic language processing. An automatic analysis system works from specific linguistic information stored in the computer in advance, with the aim of reaching a predetermined conclusion.
- Automatic research: This is an even more complex form of automatic language processing. An automatic research system works from general linguistic information stored in the computer and uses statistics, comparison, analogy, and other means to draw its own inferences.
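The counting, sorting, and indexing described under automatic editing can be illustrated with a minimal Python sketch. The two-sentence corpus, the simple letters-only tokenizer, and the function names `tokenize`, `word_list`, and `build_index` are assumptions made for illustration; real corpus tools handle punctuation, inflection, and scripts other than Latin far more carefully.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and return its runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def word_list(sentences):
    """Return (word, count) pairs, most frequent first, ties in alphabetical order."""
    counts = Counter(tok for s in sentences for tok in tokenize(s))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def build_index(sentences):
    """Map each word to the sorted sentence numbers (starting at 1) where it occurs."""
    index = defaultdict(set)
    for number, sentence in enumerate(sentences, start=1):
        for tok in tokenize(sentence):
            index[tok].add(number)
    # Alphabetical keys, as in a printed index.
    return {word: sorted(numbers) for word, numbers in sorted(index.items())}


# Hypothetical miniature corpus, used only for this example.
corpus = [
    "The cat sees the dog.",
    "The dog sees a bird.",
]

frequencies = word_list(corpus)
index = build_index(corpus)

for word, count in frequencies:
    print(f"{word}\t{count}\t{index[word]}")
```

Each printed row gives a word, its frequency, and the sentences in which it appears, which is the kind of word list and index that automatic editing produces at scale.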
Application
The core of computational linguistics is the automatic understanding and generation of language. Understanding starts from the surface string of words in a sentence, identifies its syntactic structure, determines the semantic relations among its constituents, and ultimately works out the meaning of the sentence. Generation starts from the meaning to be expressed, selects words, builds the semantic and syntactic structures among the constituents from the semantic relations among those words, and ultimately produces sentences that are grammatical and logical.
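The two directions can be illustrated with a deliberately tiny Python sketch. Everything in it is an assumption made for illustration: the seven-word `LEXICON`, the single sentence pattern (determiner, noun, verb, determiner, noun), and the meaning representation as a dictionary with `predicate`, `agent`, and `patient` keys. Practical systems use far richer grammars and semantic representations.

```python
# Hypothetical toy lexicon: word -> part of speech.
LEXICON = {
    "the": "DET", "a": "DET",
    "cat": "NOUN", "dog": "NOUN", "bird": "NOUN",
    "sees": "VERB", "chases": "VERB",
}


def parse_noun_phrase(tokens, pos):
    """Parse DET NOUN starting at pos; return (noun, position after the phrase)."""
    if (pos + 1 < len(tokens)
            and LEXICON.get(tokens[pos]) == "DET"
            and LEXICON.get(tokens[pos + 1]) == "NOUN"):
        return tokens[pos + 1], pos + 2
    raise ValueError(f"expected a noun phrase at position {pos}")


def understand(sentence):
    """Word string -> syntactic pattern -> semantic roles (the meaning)."""
    tokens = sentence.lower().rstrip(".").split()
    agent, pos = parse_noun_phrase(tokens, 0)
    if pos >= len(tokens) or LEXICON.get(tokens[pos]) != "VERB":
        raise ValueError(f"expected a verb at position {pos}")
    predicate = tokens[pos]
    patient, pos = parse_noun_phrase(tokens, pos + 1)
    if pos != len(tokens):
        raise ValueError("unexpected words after the object")
    return {"predicate": predicate, "agent": agent, "patient": patient}


def generate(meaning):
    """Semantic roles -> word choice and word order -> surface sentence."""
    words = ["the", meaning["agent"], meaning["predicate"], "the", meaning["patient"]]
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:] + "."


meaning = understand("The cat sees a dog.")
print(meaning)
print(generate(meaning))
```

Because this toy meaning representation does not record which determiner was used, the generator always chooses "the"; the regenerated sentence expresses the same roles but is not guaranteed to match the input word for word.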
Work in computational linguistics proceeds at two levels: scientific research and technological research. Scientific research aims to discover the inherent laws of language, explore computational methods for language understanding and generation, and build basic resources for language information processing. Technological research is driven by application goals: it designs and develops practical language information processing systems that meet the actual needs of society.