HyperAI

Speech Synthesis on LibriTTS

Metrics

M-STFT
MCD
PESQ
Periodicity
V/UV F1

Results

Performance results of various models on this benchmark

| Model name | M-STFT | MCD | PESQ | Periodicity | V/UV F1 | Paper Title |
|---|---|---|---|---|---|---|
| BigVSAN | 0.7881 | 0.3381 | 4.116 | 0.0935 | 0.9635 | BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network |
| BigVGAN-v2 | 0.7026 | 0.2903 | 4.362 | 0.0593 | 0.9793 | BigVGAN: A Universal Neural Vocoder with Large-Scale Training |
| PeriodWave + FreeU | 1.0269 | - | 4.248 | 0.0765 | 0.9651 | PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation |
| RFWave | - | - | 4.228 | 0.090 | 0.968 | RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction |
| WaveGlow | 1.3099 | 2.3591 | 3.138 | 0.1485 | 0.9378 | WaveGlow: A Flow-based Generative Network for Speech Synthesis |
| Vocos | - | - | 3.70 | 0.101 | 0.9582 | Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis |
| WaveFlow | 1.1120 | 1.2455 | 3.027 | 0.1416 | 0.9410 | WaveFlow: A Compact Flow-based Model for Raw Audio |
| BigVSAN (w/ snakebeta) | 0.7992 | 0.4129 | 4.120 | 0.0924 | 0.9644 | BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network |
| BigVGAN | 0.7997 | 0.3745 | 4.027 | 0.1018 | 0.9598 | BigVGAN: A Universal Neural Vocoder with Large-Scale Training |
| SC-WaveRNN | 2.2358 | 1.8854 | 1.701 | 0.3044 | 0.8144 | Speaker Conditional WaveRNN: Towards Universal Neural Vocoder for Unseen Speaker and Recording Conditions |
| PeriodWave-Turbo-L | 0.7358 | - | 4.454 | 0.0528 | 0.9756 | Accelerating High-Fidelity Waveform Generation via Adversarial Flow Matching Optimization |
| EVA-GAN-big | 0.7982 | - | 4.3536 | 0.0751 | 0.9745 | EVA-GAN: Enhanced Various Audio Generation via Scalable Generative Adversarial Networks |
| BigVGAN-base | 0.8788 | 0.4564 | 3.519 | 0.1287 | 0.9459 | BigVGAN: A Universal Neural Vocoder with Large-Scale Training |
| HiFi-GAN | 1.0017 | 0.6603 | 2.947 | 0.1565 | 0.9300 | HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis |
| EVA-GAN-base | 0.9485 | - | 4.0330 | 0.0942 | 0.9658 | EVA-GAN: Enhanced Various Audio Generation via Scalable Generative Adversarial Networks |