HyperAI

NLPCC2016 News Dataset

Date

2 years ago

Size

18.29 MB

Organization

Fudan University

Publish URL

github.com

License

Other

The NLPCC2016 dataset differs from the widely used news-domain datasets: it consists of more informal text from Sina Weibo. The training and test data consist of microblogs on a range of topics, such as finance, sports, and entertainment. The dataset is UTF-8 encoded and can be used for Chinese word segmentation tasks.
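
As a rough illustration of how UTF-8 encoded segmentation data like this is commonly prepared for a character-level tagger, the sketch below converts segmented lines into B/M/E/S labels (begin, middle, end, single-character word). It assumes each line holds one microblog with words separated by whitespace; this page does not confirm the file layout, so check the README in the archive first. The function names and the `path` argument are hypothetical.

```python
from typing import Iterator, List, Tuple


def parse_segmented_line(line: str) -> List[str]:
    """Split one line into words.

    ASSUMPTION: words are separated by whitespace. The actual
    NLPCC2016 file layout is not described on this page.
    """
    return line.split()


def words_to_bmes(words: List[str]) -> List[Tuple[str, str]]:
    """Label every character with B, M, E, or S based on its position in its word."""
    tagged: List[Tuple[str, str]] = []
    for word in words:
        if len(word) == 1:
            tagged.append((word, "S"))          # single-character word
        else:
            tagged.append((word[0], "B"))       # first character
            for ch in word[1:-1]:
                tagged.append((ch, "M"))        # interior characters
            tagged.append((word[-1], "E"))      # last character
    return tagged


def read_segmented_file(path: str) -> Iterator[List[Tuple[str, str]]]:
    """Yield one tagged sentence per non-empty line of a UTF-8 file.

    `path` is a hypothetical location of an extracted data file.
    """
    with open(path, encoding="utf-8") as f:
        for line in f:
            words = parse_segmented_line(line)
            if words:
                yield words_to_bmes(words)
```

Opening the file with an explicit `encoding="utf-8"` avoids depending on the platform default, which matters for Chinese text on systems whose default encoding is not UTF-8.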

NLPCC2016.torrent
Seeding: 2 | Downloading: 0 | Completed: 974 | Total downloads: 2,218
  • NLPCC2016/
    • README.md
      928 bytes
    • README.txt
      1.81 KB
    • data/
      • master.zip
        18.29 MB