Speech Recognition on WSJ eval92
Metrics
Word Error Rate (WER)
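
For readers unfamiliar with the metric, here is a minimal sketch of how WER is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; the page does not specify a scoring tool, and real evaluations normally apply text normalization first. The result is returned as a percentage, on the assumption that the leaderboard figures are percentages.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage, using word-level Levenshtein distance.

    Illustrative sketch only: tokenization is plain whitespace splitting,
    with no case folding or punctuation handling.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 100 when the hypothesis is much longer than the reference. Lower is better.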
Results
Performance of different models on this benchmark
| Model name | Word Error Rate (WER) | Paper Title | Repository |
| --- | --- | --- | --- |
| End-to-end LF-MMI | 3.0 | End-to-end speech recognition using lattice-free MMI | - |
| Espresso | 3.4 | Espresso: A Fast End-to-end Neural Speech Recognition Toolkit | |
| ConformerXXL-P | 1.3 | BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition | - |
| CNN over RAW speech (wav) | 5.6 | - | - |
| tdnn + chain | 2.32 | Purely sequence-trained neural networks for ASR based on lattice-free MMI | - |
| CTC-CRF ST-NAS | 2.77 | Efficient Neural Architecture Search for End-to-end Speech Recognition via Straight-Through Gradients | - |
| Jasper 10x3 | 6.9 | Jasper: An End-to-End Convolutional Neural Acoustic Model | |
| TC-DNN-BLSTM-DNN | 3.5 | Deep Recurrent Neural Networks for Acoustic Modelling | - |
| Speechstew 100M | 1.3 | SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network | - |
| test-set on open vocabulary (i.e. harder), model = HMM-DNN + pNorm* | 3.6 | - | - |
| Convolutional Speech Recognition | 3.5 | Fully Convolutional Speech Recognition | - |
| CTC-CRF VGG-BLSTM | 3.2 | CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency | |
| CTC-CRF 4gram-LM | 3.79 | CRF-based Single-stage Acoustic Modeling with CTC Topology | |
| Transformer with Relaxed Attention | 3.19 | Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition | |
| Deep Speech 2 | 3.60 | Deep Speech 2: End-to-End Speech Recognition in English and Mandarin | |
| Task activating prompting generative correction | 2.11 | Generative Speech Recognition Error Correction with Large Language Models and Task-Activating Prompting | - |
| RobustGER | 2.2 | It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition | |