DiaMoE-TTS Multi-Dialect Speech Phonetic Dataset
DiaMoE-TTS is a speech dataset for multi-dialect text-to-speech (TTS) tasks, released in 2025 by Tsinghua University in collaboration with Giant Interactive. The related research paper is titled "DiaMoE-TTS: A Unified IPA-Based Dialect TTS Framework with Mixture-of-Experts and Parameter-Efficient Zero-Shot Adaptation". The goal is to build a unified dialect phonetic representation system to support transferable speech modeling and zero-shot dialect synthesis research across multiple dialects.
This dataset is built upon multiple open-source dialect speech resources and uses IPA (the International Phonetic Alphabet) as a unified phonetic representation, so that phonological annotation is consistent across the different dialect corpora. The speech sources include the Common Voice Cantonese dataset, the Emilia Mandarin corpus, dialect speech from the KeSpeech corpus, and an open-source Minnan (Hokkien) speech dataset. During data processing, all speech samples went through the same phoneme-level conversion, producing IPA front-end annotation sequences that can be aligned across dialects.
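As a rough illustration of what a unified phoneme-level conversion to IPA could look like, the sketch below maps dialect-specific romanization units onto a shared IPA inventory through per-dialect lookup tables. The table contents, the names `TABLES` and `to_ipa`, and the omission of tones are all assumptions made for this example; they are not the dataset's actual front-end or mapping rules.

```python
# Minimal sketch of a per-dialect romanization-to-IPA front end.
# All tables below are small, hypothetical examples, not the dataset's real mappings.
# Tones are left out to keep the example short.

PINYIN_TO_IPA = {"n": "n", "i": "i", "h": "x", "ao": "au"}      # Mandarin (Pinyin)
JYUTPING_TO_IPA = {"n": "n", "ei": "ei", "h": "h", "ou": "ou"}  # Cantonese (Jyutping)

TABLES = {
    "mandarin": PINYIN_TO_IPA,
    "cantonese": JYUTPING_TO_IPA,
}


def to_ipa(units, dialect):
    """Convert a list of romanized phoneme units to a list of IPA symbols."""
    table = TABLES[dialect]
    missing = [u for u in units if u not in table]
    if missing:
        # Fail loudly instead of silently dropping unknown units.
        raise KeyError(f"no IPA mapping for {missing} in dialect '{dialect}'")
    return [table[u] for u in units]


mandarin_ipa = to_ipa(["n", "i", "h", "ao"], "mandarin")     # "ni hao"
cantonese_ipa = to_ipa(["n", "ei", "h", "ou"], "cantonese")  # "nei hou"
print(mandarin_ipa, cantonese_ipa)
```

Because both outputs draw on the same symbol set, sequences from different dialects can be compared or fed to a single model, which is the property the unified IPA annotation is meant to provide.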