HyperAI

PKU Simplified Chinese Word Segmentation Dataset

Date: 2 years ago
Size: 3.54 MB
Organization: Peking University

The SIGHAN 2005 International Chinese Automatic Word Segmentation Evaluation (SIGHAN Evaluation) brings together word segmentation datasets from multiple institutions. The datasets were jointly released by Microsoft Research China, Peking University, City University of Hong Kong, and Academia Sinica in Taiwan, and are used for training and evaluating Chinese word segmentation models. PKU is the Simplified Chinese word segmentation dataset in this collection.
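As a rough illustration of how a segmentation dataset like this is typically used for evaluation, the sketch below scores a predicted segmentation against a gold one with word-level precision, recall, and F1. It assumes each line of the data is one sentence with words separated by whitespace, which is the usual SIGHAN convention but is not stated on this page. The function names `to_spans` and `segmentation_prf` are hypothetical, and the example sentences are invented for illustration rather than taken from the dataset.

```python
def to_spans(words):
    """Convert a list of words into a set of (start, end) character spans."""
    spans, start = set(), 0
    for word in words:
        spans.add((start, start + len(word)))
        start += len(word)
    return spans


def segmentation_prf(gold_line, pred_line):
    """Word-level precision, recall, and F1 for one pair of segmented sentences.

    Assumption: words are separated by whitespace in both lines.
    A predicted word counts as correct only if both of its boundaries
    match a gold word exactly.
    """
    gold = to_spans(gold_line.split())
    pred = to_spans(pred_line.split())
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1


# Invented example: the prediction wrongly splits the first gold word in two.
gold = "北京大学  发布  了  分词  数据集"
pred = "北京  大学  发布  了  分词  数据集"
precision, recall, f1 = segmentation_prf(gold, pred)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Comparing character spans rather than word strings avoids counting a repeated word as correct when it appears at the wrong position. For corpus-level scores, the correct, predicted, and gold counts would normally be summed over all sentences before computing the ratios.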

中文分词pku.torrent
Seeding: 1 | Downloading: 0 | Completed: 201 | Total Downloads: 535
  • 中文分词pku/
    • README.md (1.06 KB)
    • README.txt (2.12 KB)
    • data/
      • chinese_word_pku.zip (3.54 MB)