FastSpeech (Mel + WaveGlow) | 3.84 | FastSpeech: Fast, Robust and Controllable Text to Speech | |
Grad-TTS + HiFiGAN (1000 steps) | 4.37 | Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech | |
Transformer TTS (Mel + WaveGlow) | 3.88 | Neural Speech Synthesis with Transformer Network | |