HyperAI
Text-to-Speech Synthesis on LJSpeech
Evaluation metric
Audio Quality MOS
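A mean opinion score (MOS) is conventionally the average of listener ratings on a 1 to 5 scale. The sketch below shows that arithmetic, assuming a simple normal-approximation 95% confidence interval. The function name `mean_opinion_score` and the sample ratings are hypothetical illustrations, not data or code from any paper listed on this page.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for listener ratings on a 1-5 scale.

    Assumption: ratings are treated as independent, and the interval uses
    the normal approximation (1.96 standard errors).
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings to estimate spread")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must lie in the range 1 to 5")
    mos = statistics.fmean(ratings)
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


if __name__ == "__main__":
    # Hypothetical ratings from eight listeners for one synthesized utterance.
    sample = [5, 4, 4, 5, 3, 4, 5, 4]
    mos, ci = mean_opinion_score(sample)
    print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Published evaluations may use different listener pools, rating protocols, and interval estimates, so this is only one plausible way to compute the figure.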
Evaluation results
Performance of each model on this benchmark
| Model name | Audio Quality MOS | Paper title |
| --- | --- | --- |
| FastSpeech (Mel + WaveGlow) | 3.84 | FastSpeech: Fast, Robust and Controllable Text to Speech |
| FastDiff-TTS | 4.03 | FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis |
| FastSpeech 2 + HiFiGAN | 4.34 | NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality |
| Grad-TTS + HiFiGAN (1000 steps) | 4.37 | Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech |
| Flowtron | - | Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis |
| Transformer TTS (Mel + WaveGlow) | 3.88 | Neural Speech Synthesis with Transformer Network |
| VITS | 4.43 | NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality |
| Matcha-TTS | - | Matcha-TTS: A fast TTS architecture with conditional flow matching |
| Tacotron 2 | - | Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis |
| FastSpeech 2 + HiFiGAN | 4.32 | FastSpeech 2: Fast and High-Quality End-to-End Text to Speech |
| temp | 1.25 | - |
| FastDiff (4 steps) | 4.28 | FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis |
| OverFlow | 3.37 | OverFlow: Putting flows on top of neural transducers for better TTS |
| NaturalSpeech | 4.56 | NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality |
| Merlin | 2.4 | FastSpeech: Fast, Robust and Controllable Text to Speech |
| Glow-TTS + HiFiGAN | 4.34 | Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search |
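
The leaderboard entries above are not listed in score order. As a minimal sketch, the snippet below ranks them by Audio Quality MOS, treating entries listed as "-" as missing and leaving them out. The names `ROWS` and `rank_by_mos` are hypothetical; the values are copied from the entries above. Several scores are taken from different papers (for example, FastSpeech 2 + HiFiGAN appears twice with 4.34 and 4.32), so the ordering is best read as indicative and not as a controlled comparison.

```python
# (model name, MOS) pairs copied from the leaderboard; None marks a "-" entry.
ROWS = [
    ("FastSpeech (Mel + WaveGlow)", 3.84),
    ("FastDiff-TTS", 4.03),
    ("FastSpeech 2 + HiFiGAN", 4.34),
    ("Grad-TTS + HiFiGAN (1000 steps)", 4.37),
    ("Flowtron", None),
    ("Transformer TTS (Mel + WaveGlow)", 3.88),
    ("VITS", 4.43),
    ("Matcha-TTS", None),
    ("Tacotron 2", None),
    ("FastSpeech 2 + HiFiGAN", 4.32),
    ("temp", 1.25),
    ("FastDiff (4 steps)", 4.28),
    ("OverFlow", 3.37),
    ("NaturalSpeech", 4.56),
    ("Merlin", 2.4),
    ("Glow-TTS + HiFiGAN", 4.34),
]


def rank_by_mos(rows):
    """Return only the scored rows, highest MOS first.

    sorted() is stable, so rows with equal scores keep their listed order.
    """
    scored = [row for row in rows if row[1] is not None]
    return sorted(scored, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for position, (name, mos) in enumerate(rank_by_mos(ROWS), start=1):
        print(f"{position:2d}. {name}: {mos:.2f}")
```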