Text To Speech Synthesis On Ljspeech
Metrics
Audio Quality MOS
Results
Performance results of various models on this benchmark
| Model Name | Audio Quality MOS | Paper Title | Repository |
| --- | --- | --- | --- |
| FastSpeech (Mel + WaveGlow) | 3.84 | FastSpeech: Fast, Robust and Controllable Text to Speech | - |
| FastDiff-TTS | 4.03 | FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis | - |
| FastSpeech 2 + HiFiGAN | 4.34 | NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality | - |
| Grad-TTS + HiFiGAN (1000 steps) | 4.37 | Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech | - |
| Flowtron | - | Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis | - |
| Transformer TTS (Mel + WaveGlow) | 3.88 | Neural Speech Synthesis with Transformer Network | - |
| VITS | 4.43 | NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality | - |
| Matcha-TTS | - | Matcha-TTS: A fast TTS architecture with conditional flow matching | - |
| Tacotron 2 | - | Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis | - |
| FastSpeech 2 + HiFiGAN | 4.32 | FastSpeech 2: Fast and High-Quality End-to-End Text to Speech | - |
| temp | 1.25 | - | - |
| FastDiff (4 steps) | 4.28 | FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis | - |
| OverFlow | 3.37 | OverFlow: Putting flows on top of neural transducers for better TTS | - |
| NaturalSpeech | 4.56 | NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality | - |
| Merlin | 2.4 | FastSpeech: Fast, Robust and Controllable Text to Speech | - |
| Glow-TTS + HiFiGAN | 4.34 | Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search | - |