HyperAI

CATSLU Chinese Audio Text Spoken Language Understanding Dataset

CATSLU is a Chinese dialogue dataset that pairs speech audio with text for spoken language understanding (SLU). It supports end-to-end experiments that go from the speech signal to understanding, such as modeling language understanding directly from phonemes rather than from words or tokens. The dataset comes from the first Chinese Audio-Textual Spoken Language Understanding challenge and includes the training and validation sets, the test sets and results, baselines, and manuals.

Dataset details and baseline results are available here.

CATSLU.torrent
  • CATSLU/
    • README.md (1.13 KB)
    • README.txt (2.26 KB)
    • data/
      • CATSLU.zip (1.13 GB)