HyperAI

Speech Recognition on AISHELL-1

Evaluation Metrics

Params(M)
Word Error Rate (WER)
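
As a rough illustration of the metric named above, WER is commonly computed as the token-level edit distance (substitutions + deletions + insertions) between a reference transcript and a hypothesis, divided by the number of reference tokens. The sketch below is a minimal implementation under that assumption; the function name `word_error_rate` is hypothetical and not taken from any toolkit listed on this page. For Mandarin corpora such as AISHELL-1 the tokens are usually individual characters, so a caller would pass space-separated characters. The leaderboard values appear to be percentages, which would correspond to this ratio multiplied by 100.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Token-level edit distance divided by the reference length.

    Hypothetical helper for illustration; tokens are whitespace-separated.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] is the edit distance between the processed reference
    # prefix and the first j hypothesis tokens.
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference tokens.
    print(word_error_rate("a b c d", "a x c"))
```

Because the denominator is the reference length, the ratio can exceed 1.0 when the hypothesis contains many insertions.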

Evaluation Results

Performance results of each model on this benchmark

| Model Name | Params (M) | Word Error Rate (WER) | Paper Title | Repository |
|---|---|---|---|---|
| U2 | 47 | 4.72 | Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition | |
| Zipformer+CR-CTC (no external language model) | 66.2 | 4.02 | CR-CTC: Consistency regularization on CTC for improved speech recognition | |
| Paraformer | 46.3 | 4.95 | FunASR: A Fundamental End-to-End Speech Recognition Toolkit | |
| Qwen-Audio | - | 1.29 | Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models | |
| Lightweight Transducer With LM | 45.3 | 4.03 | Lightweight Transducer Based on Frame-Level Criterion | |
| Paraformer-large | 220 | 1.95 | FunASR: A Fundamental End-to-End Speech Recognition Toolkit | |
| Att | - | 18.7 | End-to-end Speech Recognition with Adaptive Computation Steps | - |
| SE-WSBO With LM | 46 | 4.1 | Improving Mandarin Speech Recogntion with Block-augmented Transformer | |
| CTC/Att | - | 6.7 | A Comparative Study on Transformer vs RNN in Speech Applications | |
| CTC-CRF 4gram-LM | - | 6.34 | CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency | |
| UMA | 44.7 | 4.7 | Unimodal Aggregation for CTC-based Speech Recognition | |
| MMSpeech With LM | - | 1.9 | MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition | |
| CIF-HKD With LM | 47 | 4.1 | Knowledge Transfer from Pre-trained Language Models to Cif-based Speech Recognizers via Hierarchical Distillation | |
| BRA-E | 8.5 | 6.63 | Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognition | - |
| FireRedASR-AED | 1,100 | 0.55 | FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration | - |
| Seed-ASR | - | 0.68 | Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition | - |
| Lightweight Transducer | 45.3 | 4.31 | Lightweight Transducer Based on Frame-Level Criterion | |
| BAT | 90 | 4.97 | BAT: Boundary aware transducer for memory-efficient and low-latency ASR | |