AISHELL-1 Open Source Chinese Speech Database
AISHELL-ASR0009-OS1, the Hillshell open-source Mandarin Chinese speech database, contains 178 hours of recordings. It is a subset of the Hillshell Mandarin Chinese speech database AISHELL-ASR0009.
The AISHELL-ASR0009 recording scripts cover 11 domains, including smart home, autonomous driving, and industrial production. Recording took place in a quiet indoor environment, with three devices capturing each session simultaneously: a high-fidelity microphone (44.1 kHz, 16-bit), an Android phone (16 kHz, 16-bit), and an iOS phone (16 kHz, 16-bit). The high-fidelity microphone audio was downsampled to 16 kHz to produce AISHELL-ASR0009-OS1. A total of 400 speakers from different accent regions of China took part. The recordings were transcribed and annotated by professional proofreaders and passed strict quality inspection; the transcription accuracy of this database is above 95%. The database is divided into a training set, a development set, and a test set.
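The 44.1 kHz to 16 kHz conversion mentioned above can be illustrated with a minimal sketch. The description does not say which resampler was actually used, so the code below is only one plausible approach: a Hann-windowed sinc low-pass filter evaluated at each output position, using the Python standard library only. The function names and the `half_width` parameter are hypothetical, chosen for illustration. In practice a library routine such as `scipy.signal.resample_poly(x, 160, 441)` would likely be preferred.

```python
import math
from fractions import Fraction

SRC_RATE = 44_100  # high-fidelity microphone rate given in the description
DST_RATE = 16_000  # rate of the released AISHELL-ASR0009-OS1 audio


def resample_ratio(src_rate=SRC_RATE, dst_rate=DST_RATE):
    """Return dst_rate / src_rate as a reduced fraction (up / down)."""
    return Fraction(dst_rate, src_rate)


def downsample(samples, src_rate=SRC_RATE, dst_rate=DST_RATE, half_width=16):
    """Band-limited downsampling with a Hann-windowed sinc kernel.

    half_width (hypothetical tuning knob) is the number of sinc zero
    crossings kept on each side of the kernel centre.
    """
    ratio = dst_rate / src_rate        # below 1 when downsampling
    cutoff = min(1.0, ratio)           # low-pass at the new Nyquist frequency
    span = half_width / cutoff         # kernel half-length in input samples
    n_out = len(samples) * dst_rate // src_rate
    out = []
    for m in range(n_out):
        center = m / ratio             # output position on the input time axis
        lo = max(0, math.ceil(center - span))
        hi = min(len(samples) - 1, math.floor(center + span))
        acc = 0.0
        norm = 0.0
        for n in range(lo, hi + 1):
            t = (n - center) * cutoff
            sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
            window = 0.5 * (1.0 + math.cos(math.pi * t / half_width))
            weight = sinc * window
            acc += weight * samples[n]
            norm += weight
        # Dividing by the weight sum keeps the gain at 0 Hz equal to one.
        out.append(acc / norm if norm else 0.0)
    return out


if __name__ == "__main__":
    ratio = resample_ratio()
    print(f"up/down factors: {ratio.numerator}/{ratio.denominator}")
    one_second = [math.sin(2 * math.pi * 440 * n / SRC_RATE) for n in range(SRC_RATE)]
    print(f"{len(one_second)} samples in, {len(downsample(one_second))} samples out")
```

The rate ratio reduces to 160/441, so every 441 input samples map to 160 output samples. The low-pass step matters because content above 8 kHz (the Nyquist frequency at 16 kHz) would otherwise alias into the speech band.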