PKU Simplified Chinese Word Segmentation Dataset
Date
2 years ago
Size
3.54 MB
Publish URL
Paper URL
The SIGHAN 2005 International Chinese Word Segmentation Evaluation (SIGHAN Evaluation) brings together word segmentation datasets from multiple institutions. The collection was jointly released by Microsoft Research China, Peking University, City University of Hong Kong, and Academia Sinica in Taiwan, and is used for training and evaluating Chinese word segmentation models. PKU, the Peking University dataset, is a Simplified Chinese word segmentation dataset.
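As a rough illustration of how a segmentation dataset like this can be used for evaluation, the sketch below scores a predicted segmentation against a gold one by comparing word spans. It assumes each sentence is stored on one line with words separated by whitespace; that layout, the function names `to_spans` and `score`, and the sample sentence are illustrative assumptions, not taken from this listing, and this is not the official SIGHAN scoring script.

```python
def to_spans(words):
    """Map a list of words to a set of (start, end) character offsets."""
    spans, start = set(), 0
    for word in words:
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def score(gold_line, pred_line):
    """Return word-level (precision, recall, F1) for one sentence.

    Both arguments are whitespace-separated segmentations of the same
    underlying character sequence (an assumed file layout).
    """
    gold = to_spans(gold_line.split())
    pred = to_spans(pred_line.split())
    correct = len(gold & pred)  # words whose boundaries match exactly
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: the prediction merges the last two gold words.
gold_sentence = "共同 创造 美好 的 新 世纪"
pred_sentence = "共同 创造 美好 的 新世纪"
precision, recall, f1 = score(gold_sentence, pred_sentence)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Comparing character spans rather than word strings means a predicted word counts as correct only when both of its boundaries agree with the gold segmentation, which is the usual convention for word-level precision and recall.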
中文分词pku.torrent
Seeding: 1, Downloading: 0, Completed: 233, Total Downloads: 586