HyperAI

BookCorpus Large Book Text Dataset

Date

5 years ago

Size

2.24 GB

Publish URL

github.com

License

Non-commercial use

BookCorpus is a once-popular large text corpus that was often used for unsupervised learning of sentence encoding and decoding. However, the original author no longer offers BookCorpus for download.

Most of the data in this version of the dataset comes from free books on smashwords.com, which makes it almost the same as the original BookCorpus.

BookCorpus.torrent
Seeding: 1 | Downloading: 0 | Completed: 1,589 | Total Downloads: 3,346
  • BookCorpus/
    • .DS_Store (8 KB)
    • README.md (8.99 KB)
    • README.txt (9.98 KB)
    • books1.tar.gz (2.24 GB)
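
Since the text itself ships as a single gzip-compressed tar archive (books1.tar.gz), it can be read without unpacking it to disk. The following is a minimal sketch, not an official loader: the function name iter_text_members is hypothetical, and it assumes the archive holds plain-text book files ending in .txt encoded as UTF-8, which this page does not confirm.

```python
import tarfile
from typing import Iterator, Tuple


def iter_text_members(archive_path: str, suffix: str = ".txt") -> Iterator[Tuple[str, str]]:
    """Yield (member_name, text) for each regular file whose name ends in `suffix`.

    Assumption: the books are stored as UTF-8 text files. The real layout of
    books1.tar.gz is not documented on this page, so adjust `suffix` as needed.
    """
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            # Skip directories and unrelated files such as .DS_Store.
            if not member.isfile() or not member.name.endswith(suffix):
                continue
            handle = archive.extractfile(member)
            if handle is None:
                continue
            # Replace undecodable bytes instead of aborting on one bad file.
            yield member.name, handle.read().decode("utf-8", errors="replace")


# Example use (the path is an assumption about where the torrent was saved):
# for name, text in iter_text_members("BookCorpus/books1.tar.gz"):
#     print(name, len(text))
```

Reading one member at a time keeps memory use bounded by the largest single book rather than by the full 2.24 GB archive.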