HyperAI

icwb2-data Chinese Word Segmentation Dataset

The icwb2-data dataset was jointly released by Peking University, City University of Hong Kong, CKIP (Academia Sinica, Taiwan), and Microsoft Research China, and is used to train Chinese word segmentation models. The AS (Academia Sinica) and CityU (City University of Hong Kong) corpora are in Traditional Chinese; the PK (Peking University) and MSR (Microsoft Research) corpora are in Simplified Chinese.
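As a rough illustration of how segmented text of this kind is typically prepared for model training, the sketch below converts one whitespace-segmented sentence into per-character B/M/E/S labels (begin, middle, end, single). It assumes the training files are plain text with one sentence per line and words separated by whitespace; that layout, the BMES scheme, the helper name `to_bmes`, and the sample sentence are assumptions for illustration and are not stated on this page.

```python
def to_bmes(segmented_line: str) -> list[tuple[str, str]]:
    """Convert one whitespace-segmented sentence into (character, tag) pairs.

    Tags: B = first character of a multi-character word, M = interior
    character, E = last character, S = single-character word.
    """
    pairs: list[tuple[str, str]] = []
    # str.split() with no argument splits on any run of Unicode whitespace.
    for word in segmented_line.split():
        if len(word) == 1:
            pairs.append((word, "S"))
        else:
            pairs.append((word[0], "B"))
            pairs.extend((ch, "M") for ch in word[1:-1])
            pairs.append((word[-1], "E"))
    return pairs


# Hypothetical sample sentence, not taken from the dataset.
sample = "我 喜欢 自然语言 处理"
for char, tag in to_bmes(sample):
    print(char, tag)
```

Character-level labels like these are one common way to cast word segmentation as sequence tagging; other schemes (for example, B/I only) would work with the same input.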

icwb2-data.torrent
Seeding: 2 | Downloading: 0 | Completed: 1,176 | Total Downloads: 2,409
  • icwb2-data/
    • README.md
      939 bytes
    • README.txt
      1.83 KB
    • data/
      • icwb2-data.zip
        50.2 MB