
The CHiME dataset consists of speech recordings released for the CHiME challenge and is used mainly for speech recognition research.

The CHiME challenge is an evaluation campaign for distant multi-microphone automatic speech recognition (ASR) in everyday environments.

The dataset covers four recording setups: binaural recordings of a static speaker in a home environment, for small-vocabulary ASR tasks; binaural recordings of a moving speaker in a home environment, for medium-vocabulary ASR tasks; recordings made with 1-, 2-, and 6-channel tablet devices in various indoor and outdoor urban environments; and distant-microphone recordings of multi-party conversations in a home environment.

The CHiME challenge started in 2010, with Jon Barker, Shinji Watanabe and Emmanuel Vincent as the main initiators.

The relevant papers are as follows:

"The CHiME corpus: a resource and a challenge for Computational Hearing in Multisource Environments"

"The PASCAL CHiME speech separation and recognition challenge"

"The second CHiME speech separation and recognition challenge: Datasets, tasks and baselines"

"The second CHiME speech separation and recognition challenge: an overview of challenge systems and outcomes"

"The third CHiME speech separation and recognition challenge: analysis and outcomes"

"An analysis of environment, microphone and data simulation mismatches in robust speech recognition"

"The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines"