HyperAI超神経
Text To Speech Synthesis On Ljspeech
Evaluation Metric
Audio Quality MOS
Evaluation Results
Performance of each model on this benchmark
| Model | Audio Quality MOS | Paper Title | Repository |
|---|---|---|---|
| FastSpeech (Mel + WaveGlow) | 3.84 | FastSpeech: Fast, Robust and Controllable Text to Speech | |
| FastDiff-TTS | 4.03 | FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis | |
| FastSpeech 2 + HiFiGAN | 4.34 | NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality | |
| Grad-TTS + HiFiGAN (1000 steps) | 4.37 | Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech | - |
| Flowtron | - | Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis | |
| Transformer TTS (Mel + WaveGlow) | 3.88 | Neural Speech Synthesis with Transformer Network | |
| VITS | 4.43 | NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality | |
| Matcha-TTS | - | Matcha-TTS: A fast TTS architecture with conditional flow matching | |
| Tacotron 2 | - | Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis | |
| FastSpeech 2 + HiFiGAN | 4.32 | FastSpeech 2: Fast and High-Quality End-to-End Text to Speech | |
| temp | 1.25 | - | - |
| FastDiff (4 steps) | 4.28 | FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis | |
| OverFlow | 3.37 | OverFlow: Putting flows on top of neural transducers for better TTS | |
| NaturalSpeech | 4.56 | NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality | |
| Merlin | 2.4 | FastSpeech: Fast, Robust and Controllable Text to Speech | |
| Glow-TTS + HiFiGAN | 4.34 | Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search | |