HyperAI
Startseite
Neuigkeiten
Neueste Forschungsarbeiten
Tutorials
Datensätze
Wiki
SOTA
LLM-Modelle
GPU-Rangliste
Veranstaltungen
Suche
Über
Deutsch
HyperAI
Toggle sidebar
Seite durchsuchen…
⌘
K
Startseite
SOTA
Text To Speech Synthesis
Text To Speech Synthesis On Ljspeech
Text To Speech Synthesis On Ljspeech
Metriken
Audio Quality MOS
Ergebnisse
Leistungsergebnisse verschiedener Modelle zu diesem Benchmark
Columns
Modellname
Audio Quality MOS
Paper Title
Repository
FastSpeech (Mel + WaveGlow)
3.84
FastSpeech: Fast, Robust and Controllable Text to Speech
FastDiff-TTS
4.03
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis
FastSpeech 2 + HiFiGAN
4.34
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality
Grad-TTS + HiFiGAN (1000 steps)
4.37
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech
-
Flowtron
-
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis
Transformer TTS (Mel + WaveGlow)
3.88
Neural Speech Synthesis with Transformer Network
VITS
4.43
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality
Matcha-TTS
-
Matcha-TTS: A fast TTS architecture with conditional flow matching
Tacotron 2
-
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis
FastSpeech 2 + HiFiGAN
4.32
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
temp
1.25
-
-
FastDiff (4 steps)
4.28
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis
OverFlow
3.37
OverFlow: Putting flows on top of neural transducers for better TTS
NaturalSpeech
4.56
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality
Merlin
2.4
FastSpeech: Fast, Robust and Controllable Text to Speech
Glow-TTS + HiFiGAN
4.34
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
0 of 16 row(s) selected.
Previous
Next