CSEMOTIONS Emotional Audio Dataset

CSEMOTIONS is an emotional audio dataset released by Alibaba in 2025. The associated paper is "Marco-Voice Technical Report". The dataset aims to support research on controllable and natural speech generation.

This dataset contains approximately 10 hours of high-quality audio recorded by 10 professional voice actors (5 male and 5 female) across seven emotion categories: calm, happy, angry, sad, surprised, disgusted, and fearful. Each emotion category contains 500-700 recordings of Chinese text.