CPED Chinese Conversation Dataset
Date
3 years ago
Publish URL
License
Other
Categories

CPED (Chinese Personalized and Emotional Dialogue) is the first large-scale Chinese dataset for personalized and emotional dialogue. It combines multi-source knowledge related to empathy and personal characteristics: gender, the Big Five personality traits, 13 emotions, 19 dialogue acts, and 10 scenes.
The dataset contains:
- About 133,000 utterances with multimodal context
- More than 12,000 conversations from 392 speakers across 40 TV shows
- Annotations for 3 character attributes (name, gender, age), the Big Five personality traits, 2 kinds of dynamic emotional information (sentiment and emotion), and dialogue acts (DA)
- Three tasks: Personality Recognition in Conversation (PRC), Emotion Recognition in Conversation (ERC), and Personalized and Emotional Conversation (PEC)
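To make the annotation layers listed above more concrete, here is a minimal Python sketch of how one annotated utterance could be represented in memory. The class names, field names, and helper functions are all hypothetical and chosen for illustration; they are not the dataset's actual file schema, and the label values used in any example are placeholders rather than CPED's real label set.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class Speaker:
    """Static speaker-level annotations (hypothetical field names)."""
    name: str
    gender: Optional[str] = None
    age: Optional[str] = None
    # Big Five trait name -> label, e.g. {"openness": "high"} (placeholder values)
    personality: Dict[str, str] = field(default_factory=dict)


@dataclass
class Utterance:
    """One utterance with its dynamic annotations (hypothetical field names)."""
    dialogue_id: str
    speaker: Speaker
    text: str
    sentiment: Optional[str] = None
    emotion: Optional[str] = None       # one of the 13 emotion labels
    dialogue_act: Optional[str] = None  # one of the 19 DA labels
    scene: Optional[str] = None         # one of the 10 scenes


def group_by_dialogue(utterances: Iterable[Utterance]) -> Dict[str, List[Utterance]]:
    """Collect utterances into conversations, preserving input order."""
    dialogues: Dict[str, List[Utterance]] = {}
    for utt in utterances:
        dialogues.setdefault(utt.dialogue_id, []).append(utt)
    return dialogues


def emotion_counts(utterances: Iterable[Utterance]) -> Counter:
    """Count emotion labels, skipping utterances without one."""
    return Counter(u.emotion for u in utterances if u.emotion is not None)
```

Keeping speaker-level attributes separate from utterance-level labels mirrors the split between static character annotations (used for PRC) and dynamic emotional annotations (used for ERC); a PEC model would typically consume both.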