SeniorTalk Chinese Speech Dataset for Elderly People's Conversations
SeniorTalk is a Chinese conversational speech dataset of super-aged seniors, released in March 2025 by Nankai University and the Beijing Academy of Artificial Intelligence (BAAI, also known as the Beijing Zhiyuan Artificial Intelligence Research Institute) and presented as the world's first dataset of its kind. The accompanying paper is "SeniorTalk: A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors". The dataset carries fine-grained, multi-dimensional annotations, including speaker information, dialogue transcriptions, timestamps at both the sentence and word level, and accent category labels. This real-world data is intended to support in-depth research on elderly speech and the optimization of voice interaction systems for older users, and to advance related applications such as age-friendly device adaptation, health management, and assistive elderly-care robots.
Key Features:
- Large scale: 55.53 hours of speech from 202 very elderly speakers.
- Wide geographical coverage: The data was collected from 16 provinces and cities and covers a range of regional accents.
- Natural, realistic interaction: Recordings are spontaneous two-person conversations on topics such as retirement, health, and daily life, closely matching real communication scenarios.

