HyperAI

CPED Chinese Conversation Dataset

Date

3 years ago

Organization

IEEE

Publish URL

github.com

License

Other


CPED (Chinese Personalized and Emotional Dialogue) is the first large-scale Chinese personalized and emotional dialogue dataset. It provides multi-source knowledge related to empathy and personal characteristics, covering gender, the Big Five personality traits, 13 emotions, 19 dialogue acts, and 10 scenes.

The dataset contains:

  • More than 133,000 utterances with multimodal context
  • More than 12,000 conversations from 392 speakers across 40 TV shows
  • Annotations for 3 character attributes (name, gender, age), the Big Five personality traits, 2 types of dynamic emotional information (sentiment and emotion), and dialogue acts (DA)
  • Three tasks: Personality Recognition in Conversation (PRC), Emotion Recognition in Conversation (ERC), and Personalized and Emotional Conversation (PEC)
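
The annotation layers listed above can be pictured as one record per utterance. The following is a minimal Python sketch of such a record, not the dataset's actual file format: every class, field, and function name here (`Speaker`, `Utterance`, `group_by_dialogue`, and so on) is hypothetical, and the real column names and label sets should be taken from the files published at the dataset's repository.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Standard Big Five trait names; how CPED encodes them is an assumption.
BIG_FIVE = (
    "openness",
    "conscientiousness",
    "extraversion",
    "agreeableness",
    "neuroticism",
)


@dataclass
class Speaker:
    """Static, speaker-level annotations (hypothetical field names)."""
    name: str
    gender: Optional[str] = None
    age: Optional[str] = None
    # Maps each Big Five trait to a label; the label format is assumed.
    personality: Dict[str, str] = field(default_factory=dict)


@dataclass
class Utterance:
    """Dynamic, utterance-level annotations (hypothetical field names)."""
    dialogue_id: str
    speaker: Speaker
    text: str
    sentiment: Optional[str] = None
    emotion: Optional[str] = None
    dialogue_act: Optional[str] = None


def group_by_dialogue(utterances: List[Utterance]) -> Dict[str, List[Utterance]]:
    """Collect utterances into dialogues, preserving their input order."""
    dialogues: Dict[str, List[Utterance]] = defaultdict(list)
    for utt in utterances:
        dialogues[utt.dialogue_id].append(utt)
    return dict(dialogues)
```

Under this layout, PRC would predict `Speaker.personality` from a speaker's utterances, ERC would predict `Utterance.emotion` from the dialogue context, and PEC would generate the next `Utterance.text` conditioned on both.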