HyperAI

AISHELL-1 Open Source Chinese Speech Database

Date

a year ago

Size

14.52 GB

Organization

The AISHELL Mandarin Chinese open-source speech database AISHELL-ASR0009-OS1 contains 178 hours of recordings and is part of the larger AISHELL Mandarin Chinese speech database AISHELL-ASR0009.

The AISHELL-ASR0009 recording scripts cover 11 domains, including smart home, autonomous driving, and industrial production. Recording took place in a quiet indoor environment using three devices simultaneously: a high-fidelity microphone (44.1 kHz, 16-bit), an Android phone (16 kHz, 16-bit), and an iOS phone (16 kHz, 16-bit). The audio from the high-fidelity microphone was downsampled to 16 kHz to produce AISHELL-ASR0009-OS1. 400 speakers from different accent regions of China took part in the recording. The recordings were transcribed and annotated by professional proofreaders and passed strict quality inspection, giving a transcription accuracy above 95%. The database is divided into a training set, a development set, and a test set.
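The format described above (16 kHz, 16-bit) can be spot-checked after download. The following is a minimal sketch using only the Python standard library, assuming the audio is stored as WAV files; the helper names `describe_wav` and `matches_described_format` are illustrative and not part of the dataset. It also shows the reduced resampling ratio implied by the 44.1 kHz to 16 kHz downsampling step.

```python
import wave
from fractions import Fraction


def describe_wav(path):
    """Return basic format facts read from a WAV file header."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        return {
            "sample_rate_hz": sample_rate,
            "bit_depth": wav.getsampwidth() * 8,  # bytes per sample -> bits
            "channels": wav.getnchannels(),
            "duration_s": wav.getnframes() / sample_rate,
        }


def matches_described_format(info, sample_rate_hz=16000, bit_depth=16):
    """Check a header summary against the 16 kHz, 16-bit format stated above."""
    return (
        info["sample_rate_hz"] == sample_rate_hz
        and info["bit_depth"] == bit_depth
    )


# Resampling ratio for the 44.1 kHz -> 16 kHz step, reduced to lowest terms.
# A polyphase resampler would upsample by the numerator and downsample by
# the denominator.
DOWNSAMPLE_RATIO = Fraction(16000, 44100)  # 160/441
```

For example, `matches_described_format(describe_wav("example.wav"))` should return `True` for a conforming file, where `example.wav` is a placeholder path to any extracted utterance.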

AISHELL-1.torrent
Seeding: 1 | Downloading: 1 | Completed: 156 | Total downloads: 496
  • AISHELL-1/
    • README.md (1.5 KB)
    • README.txt (3 KB)
    • data/
      • AISHELL-1.zip (14.52 GB)