Speech Recognition on WSJ eval92

Metrics

Word Error Rate (WER)
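
As an illustration of the metric, WER is commonly defined as the word-level edit distance between a reference transcript and a hypothesis (substitutions + deletions + insertions), divided by the number of reference words. The sketch below is a minimal version under assumptions of its own: whitespace tokenization and no text normalization. The function name `word_error_rate` is hypothetical and not taken from any toolkit listed on this page. Leaderboard values such as 3.60 appear to be percentages, so the returned fraction would be multiplied by 100 for comparison.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Assumption: words are separated by whitespace and compared exactly
    (no case folding or punctuation stripping).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference prefix processed so far
    # and the first j hypothesis words (one row of the Levenshtein table).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {100 * wer:.2f}%")
```

Published results typically apply dataset-specific text normalization and scoring tools before computing WER, so numbers from this sketch are not directly comparable to the table entries.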

Results

Performance results of various models on this benchmark

| Model name | Word Error Rate (WER) | Paper title | Repository |
| --- | --- | --- | --- |
| End-to-end LF-MMI | 3.0 | End-to-end speech recognition using lattice-free MMI | - |
| Espresso | 3.4 | Espresso: A Fast End-to-end Neural Speech Recognition Toolkit | |
| ConformerXXL-P | 1.3 | BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition | - |
| CNN over RAW speech (wav) | 5.6 | - | - |
| tdnn + chain | 2.32 | Purely sequence-trained neural networks for ASR based on lattice-free MMI | - |
| CTC-CRF ST-NAS | 2.77 | Efficient Neural Architecture Search for End-to-end Speech Recognition via Straight-Through Gradients | - |
| Jasper 10x3 | 6.9 | Jasper: An End-to-End Convolutional Neural Acoustic Model | |
| TC-DNN-BLSTM-DNN | 3.5 | Deep Recurrent Neural Networks for Acoustic Modelling | - |
| Speechstew 100M | 1.3 | SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network | - |
| test-set on open vocabulary (i.e. harder), model = HMM-DNN + pNorm* | 3.6 | - | - |
| Convolutional Speech Recognition | 3.5 | Fully Convolutional Speech Recognition | - |
| CTC-CRF VGG-BLSTM | 3.2 | CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency | |
| CTC-CRF 4gram-LM | 3.79 | CRF-based Single-stage Acoustic Modeling with CTC Topology | |
| Transformer with Relaxed Attention | 3.19 | Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition | |
| Deep Speech 2 | 3.60 | Deep Speech 2: End-to-End Speech Recognition in English and Mandarin | |
| Task activating prompting generative correction | 2.11 | Generative Speech Recognition Error Correction with Large Language Models and Task-Activating Prompting | - |
| RobustGER | 2.2 | It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition | |