Treebank

A treebank is a deeply annotated corpus in which sentences are segmented into words, tagged with parts of speech, and annotated with their syntactic structure.

Classification of treebanks

Treebanks can be roughly divided into two categories: phrase structure treebanks and dependency structure treebanks.

  • Phrase structure treebank: describes a sentence in terms of its constituents, that is, the phrases it is built from;
  • Dependency structure treebank: describes a sentence in terms of the dependency relations between its words.
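
As a minimal sketch of the difference, the same sentence can be encoded both ways in Python. The sentence, the Penn-Treebank-style labels (S, NP, VP, DT, NN, VBZ), and the dependency labels (det, nsubj, root) are illustrative assumptions and are not drawn from any particular treebank; the helper names are hypothetical.

```python
# Illustrative sentence; all labels below are assumptions for this sketch.
tokens = ["The", "cat", "sleeps"]

# Phrase structure: nested (label, children...) tuples; a leaf is (POS, word).
phrase_tree = (
    "S",
    ("NP", ("DT", "The"), ("NN", "cat")),
    ("VP", ("VBZ", "sleeps")),
)

# Dependency structure: one (head_index, relation) pair per token.
# Indices are 1-based; head 0 marks the artificial root of the sentence.
dependencies = [
    (2, "det"),    # The    <- cat
    (3, "nsubj"),  # cat    <- sleeps
    (0, "root"),   # sleeps <- ROOT
]


def is_leaf(children):
    """A leaf node has exactly one child, and that child is the word itself."""
    return len(children) == 1 and isinstance(children[0], str)


def leaves(tree):
    """Collect the words of a phrase structure tree from left to right."""
    _label, *children = tree
    if is_leaf(children):
        return [children[0]]
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


def bracketed(tree):
    """Render a phrase structure tree in bracketed notation."""
    label, *children = tree
    if is_leaf(children):
        return f"({label} {children[0]})"
    return f"({label} " + " ".join(bracketed(c) for c in children) + ")"


def root_word(words, deps):
    """Return the word whose head is the artificial root (index 0)."""
    for word, (head, _relation) in zip(words, deps):
        if head == 0:
            return word
    return None


print(bracketed(phrase_tree))
print(root_word(tokens, dependencies))
```

Both encodings cover the same three words. The phrase structure version groups them into nested constituents, while the dependency version links each word directly to its head.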

The role of treebanks

  • Provide training data and an evaluation benchmark for automatic parsers;
  • Provide annotated real-world text for syntactic research;
  • Serve as the basis for annotating word senses and the semantic relations between words within sentences.
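
To illustrate the first role, a treebank's gold annotations can serve as the reference against which a parser's output is scored. The sketch below computes the unlabeled attachment score, a common dependency parsing metric that gives the fraction of words assigned the correct head. The function name and the head lists are hypothetical, and the "predicted" heads are made up for illustration rather than produced by a real parser.

```python
def unlabeled_attachment_score(gold_heads, predicted_heads):
    """Fraction of tokens whose predicted head matches the gold head."""
    if len(gold_heads) != len(predicted_heads):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold_heads:
        return 0.0
    correct = sum(g == p for g, p in zip(gold_heads, predicted_heads))
    return correct / len(gold_heads)


# Gold heads for a three-word sentence (1-based indices, 0 = root).
gold = [2, 3, 0]
# Hypothetical parser output: the first word is attached to the wrong head.
predicted = [3, 3, 0]

score = unlabeled_attachment_score(gold, predicted)
print(f"UAS = {score:.2f}")
```

Here two of the three heads agree with the gold annotation, so the score is 2/3.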

References

[1] Wang Yuelong, Ji Donghong. A review of Chinese treebanks [J]. Contemporary Linguistics, 2009(1): 47-55.