HyperAI
HyperAI超神経
Text-to-Speech Synthesis on LJSpeech
Evaluation Metric
Audio Quality MOS
Evaluation Results
Performance results of each model on this benchmark
| Model | Audio Quality MOS | Paper Title | Repository |
| --- | --- | --- | --- |
| FastSpeech (Mel + WaveGlow) | 3.84 | FastSpeech: Fast, Robust and Controllable Text to Speech | - |
| FastDiff-TTS | 4.03 | FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis | - |
| FastSpeech 2 + HiFiGAN | 4.34 | NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality | - |
| Grad-TTS + HiFiGAN (1000 steps) | 4.37 | Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech | - |
| Flowtron | - | Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis | - |
| Transformer TTS (Mel + WaveGlow) | 3.88 | Neural Speech Synthesis with Transformer Network | - |
| VITS | 4.43 | NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality | - |
| Matcha-TTS | - | Matcha-TTS: A fast TTS architecture with conditional flow matching | - |
| Tacotron 2 | - | Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis | - |
| FastSpeech 2 + HiFiGAN | 4.32 | FastSpeech 2: Fast and High-Quality End-to-End Text to Speech | - |
| temp | 1.25 | - | - |
| FastDiff (4 steps) | 4.28 | FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis | - |
| OverFlow | 3.37 | OverFlow: Putting flows on top of neural transducers for better TTS | - |
| NaturalSpeech | 4.56 | NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality | - |
| Merlin | 2.4 | FastSpeech: Fast, Robust and Controllable Text to Speech | - |
| Glow-TTS + HiFiGAN | 4.34 | Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search | - |