HyperAI

THUCNews News Dataset

Date: 2 years ago

Size: 1.45 GB

Organization: Tsinghua University

License: Other

The THUCNews dataset was built by filtering historical Sina News data from 2005 to 2011. It contains 740,000 news documents, all in UTF-8 plain text format. Starting from the original Sina News classification system, the documents were reorganized into 14 candidate categories: finance, lottery, real estate, stocks, home furnishing, education, technology, society, fashion, current affairs, sports, horoscopes, games, and entertainment.
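Since the documents are UTF-8 plain text grouped into categories, a quick sanity check after extracting the archive is to count the files per category. The following is a minimal sketch, not an official loader. It assumes the extracted archive has one subdirectory per category with one `.txt` file per document; that layout is not stated on this page, and the function names `count_documents` and `read_document` are hypothetical.

```python
from collections import Counter
from pathlib import Path


def count_documents(root: str) -> Counter:
    """Count .txt files in each immediate subdirectory of root.

    Assumption: each subdirectory of root is one category and each
    .txt file inside it is one news document.
    """
    counts = Counter()
    for category_dir in sorted(Path(root).iterdir()):
        if category_dir.is_dir():
            counts[category_dir.name] = sum(1 for _ in category_dir.glob("*.txt"))
    return counts


def read_document(path: str) -> str:
    """Read a single news document, decoding it as UTF-8."""
    return Path(path).read_text(encoding="utf-8")
```

Calling `count_documents` on the extracted directory returns a `Counter` keyed by subdirectory name. If the assumed layout holds, it should have 14 keys, one per category listed above.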

THUCNews.torrent
Seeding: 2 | Downloading: 0 | Completed: 955 | Total Downloads: 2,731
  • THUCNews/
    • README.md (1.01 KB)
    • README.txt (2.01 KB)
    • data/
      • THUCNews.zip (1.45 GB)