Speech Recognition on LibriSpeech test-clean
Metrics
Word Error Rate (WER)
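For reference, WER is the word-level edit distance between the hypothesis and the reference transcript, normalized by the number of reference words: WER = (substitutions + deletions + insertions) / reference words. The values in the table below follow the usual convention of quoting WER as a percentage. Below is a minimal sketch of the computation, assuming simple whitespace tokenization; the function name is illustrative and is not the official scoring script of this benchmark.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words,
    computed via word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i                      # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j                      # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1   # substitution cost
            dp[i][j] = min(dp[i - 1][j] + 1,              # deletion
                           dp[i][j - 1] + 1,              # insertion
                           dp[i - 1][j - 1] + cost)       # match / substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

# Example: one substitution over four reference words -> 0.25, i.e. 25% WER
print(word_error_rate("the cat sat down", "the cat sits down"))
```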
Results
Performance results of various models on this benchmark.
Comparison table
Model name | Word Error Rate (WER) |
---|---|
specaugment-a-simple-data-augmentation-method | 2.7 |
fast-simpler-and-more-accurate-hybrid-asr | 2.10 |
multi-head-state-space-model-for-speech | 1.76 |
conformer-convolution-augmented-transformer | 1.9 |
rwth-asr-systems-for-librispeech-hybrid-vs | 2.3 |
iterative-pseudo-labeling-for-speech | 2.10 |
cr-ctc-consistency-regularization-on-ctc-for | 2.02 |
improved-training-of-end-to-end-attention | 3.82 |
neural-network-language-modeling-with-letter | 3.06 |
transformer-based-asr-incorporating-time | 1.9 |
the-pytorch-kaldi-speech-recognition-toolkit | 6.2 |
a-comparative-study-on-transformer-vs-rnn-in | 2.6 |
snips-voice-platform-an-embedded-spoken | 6.4 |
w2v-bert-combining-contrastive-learning-and | 1.4 |
cr-ctc-consistency-regularization-on-ctc-for | 1.88 |
end-to-end-asr-from-supervised-to-semi | 2.31 |
state-of-the-art-speech-recognition-using | 2.20 |
zipformer-a-faster-and-better-encoder-for | 2.00 |
high-precision-medical-speech-recognition | 0.985 |
contextnet-improving-convolutional-neural | 2 |
crf-based-single-stage-acoustic-modeling-with | 4.09 |
librispeech-transducer-model-with-internal | 2.23 |
hubert-self-supervised-speech-representation | 1.8 |
fast-conformer-with-linearly-scalable | 1.46 |
graph-convolutions-enrich-the-self-attention | 2.11 |
qwen-audio-advancing-universal-audio | 2.0 |
wavlm-large-scale-self-supervised-pre | 1.8 |
jasper-an-end-to-end-convolutional-neural | 2.95 |
Model 29 | 8.0 |
contextnet-improving-convolutional-neural | 1.9 |
squeezeformer-an-efficient-transformer-for | 2.47 |
fadam-adam-is-a-natural-gradient-optimizer | 1.34 |
self-training-and-pre-training-are | 2.7 |
speechstew-simply-mix-all-available-speech | 1.7 |
letter-based-speech-recognition-with-gated | 4.8 |
self-training-and-pre-training-are | 1.5 |
Model 37 | 4.3 |
pushing-the-limits-of-semi-supervised | 1.4 |
semi-supervised-speech-recognition-via-local | 7.19 |
conformer-convolution-augmented-transformer | 2 |
end-to-end-asr-from-supervised-to-semi | 2.03 |
conformer-convolution-augmented-transformer | 2.1 |
let-ssms-be-convnets-state-space-modeling | 4.4 |
improving-rnn-transducer-based-asr-with | 2.0 |
espresso-a-fast-end-to-end-neural-speech | 2.8 |
model-unit-exploration-for-sequence-to | 3.60 |
fully-convolutional-speech-recognition | 3.26 |
wav2vec-2-0-a-framework-for-self-supervised | 1.8 |
Model 49 | 5.5 |
e-branchformer-branchformer-with-enhanced | 1.81 |
transformer-based-acoustic-modeling-for | 2.26 |
quartznet-deep-automatic-speech-recognition | 2.69 |
improving-end-to-end-speech-recognition-with-1 | 5.42 |
contextnet-improving-convolutional-neural | 2.3 |
specaugment-a-simple-data-augmentation-method | 2.5 |
Model 56 | 4.8 |
deep-speech-2-end-to-end-speech-recognition | 5.33 |
mt4ssl-boosting-self-supervised-speech | 3.4 |
asapp-asr-multistream-cnn-and-self-attentive | 1.75 |
jasper-an-end-to-end-convolutional-neural | 2.84 |
samba-asr-state-of-the-art-speech-recognition | 1.17 |
improved-noisy-student-training-for-automatic | 1.7 |
amortized-neural-networks-for-low-latency | 8.6 |
speechstew-simply-mix-all-available-speech | 2.0 |