NCIFD National Culture Fine-tuning Dataset
Date: 7 months ago
Size: 1.43 MB
NCIFD (National Culture Instruction-Following Dataset) is a fine-tuning dataset on national culture for large language models, constructed by the National Language Resource Monitoring and Research Center for Minority Languages at Minzu University of China. It contains 151,159 data items, of which 10,000 are publicly available, and covers seven areas: architecture, clothing, crafts, food, etiquette, language, and customs.
The dataset mainly consists of two parts:
- NCSI (National Culture Self-Instruct): instruction data generated by a large language model through the Self-Instruct framework, then screened for quality.
- NCQA (National Culture Self-QA): QA pairs generated by a large language model through the Self-QA framework, then screened so that the questions are clear and the answers are complete, accurate, and clear.
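The listing does not say how the quality screening is carried out. The following is only a minimal sketch of what a rule-based first pass over generated QA pairs could look like. The `QAPair` type, the `screen` function, the 20-character threshold, and the heuristics themselves are all hypothetical and are not taken from NCIFD. Answer accuracy cannot be checked by rules like these and would need human or model-based review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QAPair:
    # Hypothetical record layout; the real NCIFD schema is not given in the listing.
    question: str
    answer: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so near-identical questions compare equal."""
    return " ".join(text.lower().split())


def screen(pairs, min_answer_chars=20):
    """Keep pairs that pass simple, assumed quality checks (not the authors' criteria)."""
    seen = set()
    kept = []
    for pair in pairs:
        question = pair.question.strip()
        answer = pair.answer.strip()
        # Clarity proxy: the question must be non-empty and end with a question mark.
        if not question or not question.endswith(("?", "？")):
            continue
        # Completeness proxy: drop very short answers.
        if len(answer) < min_answer_chars:
            continue
        # Drop duplicates of a question that was already kept.
        key = normalize(question)
        if key in seen:
            continue
        seen.add(key)
        kept.append(QAPair(question, answer))
    return kept
```

Calling `screen(pairs)` on a list of `QAPair` objects returns the surviving pairs in their original order; raising `min_answer_chars` makes the completeness check stricter.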