Text-to-Speech Synthesis on LJSpeech
Metrics
Audio Quality MOS
Results
Performance results of various models on this benchmark
| Model name | Audio Quality MOS | Paper Title | Repository |
| --- | --- | --- | --- |
| FastSpeech (Mel + WaveGlow) | 3.84 | FastSpeech: Fast, Robust and Controllable Text to Speech | |
| FastDiff-TTS | 4.03 | FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis | |
| FastSpeech 2 + HiFiGAN | 4.34 | NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality | |
| Grad-TTS + HiFiGAN (1000 steps) | 4.37 | Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech | - |
| Flowtron | - | Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis | |
| Transformer TTS (Mel + WaveGlow) | 3.88 | Neural Speech Synthesis with Transformer Network | |
| VITS | 4.43 | NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality | |
| Matcha-TTS | - | Matcha-TTS: A fast TTS architecture with conditional flow matching | |
| Tacotron 2 | - | Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis | |
| FastSpeech 2 + HiFiGAN | 4.32 | FastSpeech 2: Fast and High-Quality End-to-End Text to Speech | |
| temp | 1.25 | - | - |
| FastDiff (4 steps) | 4.28 | FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis | |
| OverFlow | 3.37 | OverFlow: Putting flows on top of neural transducers for better TTS | |
| NaturalSpeech | 4.56 | NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality | |
| Merlin | 2.4 | FastSpeech: Fast, Robust and Controllable Text to Speech | |
| Glow-TTS + HiFiGAN | 4.34 | Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search | |