| Model | WER (%) | Paper | Code |
| --- | --- | --- | --- |
| CTC + Transformer LM rescoring | 2.10 | Faster, Simpler and More Accurate Hybrid ASR Systems Using Wordpieces | - |
| Hybrid model with Transformer rescoring | 2.3 | RWTH ASR Systems for LibriSpeech: Hybrid vs Attention -- w/o Data Augmentation | - |
| Conv + Transformer AM + Iterative Pseudo-Labeling (n-gram LM + Transformer Rescoring) | 2.10 | Iterative Pseudo-Labeling for Speech Recognition | - |
| Zipformer+CR-CTC (no external language model) | 2.02 | CR-CTC: Consistency regularization on CTC for improved speech recognition | - |
| Transformer+Time reduction+Self Knowledge distillation | 1.9 | Transformer-based ASR Incorporating Time-reduction Layer and Fine-tuning with Self-Knowledge Distillation | - |
| Zipformer+pruned transducer w/ CR-CTC (no external language model) | 1.88 | CR-CTC: Consistency regularization on CTC for improved speech recognition | - |
| Conv + Transformer AM (ConvLM with Transformer Rescoring) (LS only) | 2.31 | End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures | - |
| Multi-Stream Self-Attention With Dilated 1D Convolutions | 2.20 | State-of-the-Art Speech Recognition Using Multi-Stream Self-Attention With Dilated 1D Convolutions | - |
| Zipformer+pruned transducer (no external language model) | 2.00 | Zipformer: A faster and better encoder for automatic speech recognition | - |