HyperAI
الرئيسية
الأخبار
أحدث الأوراق البحثية
الدروس
مجموعات البيانات
الموسوعة
SOTA
نماذج LLM
لوحة الأداء GPU
الفعاليات
البحث
حول
العربية
HyperAI
Toggle sidebar
البحث في الموقع...
⌘
K
الرئيسية
SOTA
Text To Speech Synthesis
Text To Speech Synthesis On Ljspeech
Text To Speech Synthesis On Ljspeech
المقاييس
Audio Quality MOS
النتائج
نتائج أداء النماذج المختلفة على هذا المعيار القياسي
Columns
اسم النموذج
Audio Quality MOS
Paper Title
Repository
FastSpeech (Mel + WaveGlow)
3.84
FastSpeech: Fast, Robust and Controllable Text to Speech
FastDiff-TTS
4.03
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis
FastSpeech 2 + HiFiGAN
4.34
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality
Grad-TTS + HiFiGAN (1000 steps)
4.37
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech
-
Flowtron
-
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis
Transformer TTS (Mel + WaveGlow)
3.88
Neural Speech Synthesis with Transformer Network
VITS
4.43
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality
Matcha-TTS
-
Matcha-TTS: A fast TTS architecture with conditional flow matching
Tacotron 2
-
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis
FastSpeech 2 + HiFiGAN
4.32
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
temp
1.25
-
-
FastDiff (4 steps)
4.28
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis
OverFlow
3.37
OverFlow: Putting flows on top of neural transducers for better TTS
NaturalSpeech
4.56
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality
Merlin
2.4
FastSpeech: Fast, Robust and Controllable Text to Speech
Glow-TTS + HiFiGAN
4.34
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
0 of 16 row(s) selected.
Previous
Next