
Speech Recognition on LibriSpeech test-other

Metrics

Word Error Rate (WER)
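
WER is the word-level edit distance (substitutions + deletions + insertions) between a system's hypothesis and the reference transcript, divided by the number of reference words; lower is better. Below is a minimal illustrative sketch of the computation in Python (not the scoring script used for any entry on this leaderboard):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming edit distance between the two word sequences.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all remaining reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all remaining hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
    return d[len(ref)][len(hyp)] / len(ref)


# Example: one substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))  # ~0.333, i.e. 33.3% WER
```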

Results

Performance results of various models on this benchmark

| Model name | Word Error Rate (WER) | Paper Title |
|---|---|---|
| Espresso | 8.7 | Espresso: A Fast End-to-end Neural Speech Recognition Toolkit |
| Jasper DR 10x5 (+ Time/Freq Masks) | 7.84 | Jasper: An End-to-End Convolutional Neural Acoustic Model |
| Conv + Transformer AM (ConvLM with Transformer Rescoring) (LS only) | 5.18 | End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures |
| ContextNet(L) | 4.1 | ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context |
| SAMBA ASR | 2.48 | Samba-ASR: State-Of-The-Art Speech Recognition Leveraging Structured State-Space Models |
| LAS + SpecAugment | 5.8 | SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition |
| Jasper DR 10x5 | 8.79 | Jasper: An End-to-End Convolutional Neural Acoustic Model |
| Convolutional Speech Recognition | 10.47 | Fully Convolutional Speech Recognition |
| FAdam | 2.49 | FAdam: Adam is a natural gradient optimizer using diagonal empirical Fisher information |
| Conv + Transformer AM + Iterative Pseudo-Labeling (n-gram LM + Transformer Rescoring) | 3.83 | Iterative Pseudo-Labeling for Speech Recognition |
| Conformer(M) | 4.3 | Conformer: Convolution-augmented Transformer for Speech Recognition |
| Multi-Stream Self-Attention With Dilated 1D Convolutions | 5.80 | State-of-the-Art Speech Recognition Using Multi-Stream Self-Attention With Dilated 1D Convolutions |
| ContextNet(M) | 4.5 | ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context |
| HuBERT with Libri-Light | 2.9 | HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units |
| Zipformer + pruned transducer w/ CR-CTC (no external language model) | 3.95 | CR-CTC: Consistency regularization on CTC for improved speech recognition |
| Local Prior Matching (Large Model, ConvLM LM) | 15.28 | Semi-Supervised Speech Recognition via Local Prior Matching |
| MT4SSL | 9.6 | MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets |
| Conformer(S) | 5.0 | Conformer: Convolution-augmented Transformer for Speech Recognition |
| Branchformer + GFSA | 4.94 | Graph Convolutions Enrich the Self-Attention in Transformers! |
| Zipformer + CR-CTC (no external language model) | 4.35 | CR-CTC: Consistency regularization on CTC for improved speech recognition |