HyperAI초신경

Speech Recognition on AISHELL-1

Evaluation Metrics

Params(M)

Word Error Rate (WER)
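
As a rough illustration of the metric, the sketch below computes an error rate as the edit distance (substitutions, insertions, deletions) between a reference and a hypothesis token sequence, divided by the reference length and expressed as a percentage. The function names `edit_distance` and `error_rate` are hypothetical and not taken from any toolkit listed here. It is also an assumption that the tokens are characters: Mandarin benchmarks are commonly scored per character, but this page only names the metric as WER.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (hypothetical helper)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference tokens matches an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference token missing in hyp)
                curr[j - 1] + 1,     # insertion (extra token in hyp)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Error rate in percent: 100 * edit distance / reference length."""
    if not ref_tokens:
        raise ValueError("reference must contain at least one token")
    return 100.0 * edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


if __name__ == "__main__":
    # Character-level scoring (an assumption for Mandarin text).
    reference = list("今天天气很好")
    hypothesis = list("今天天汽很好")
    print(f"{error_rate(reference, hypothesis):.2f}")
```

Passing `text.split()` instead of `list(text)` gives word-level scoring for whitespace-delimited languages; the arithmetic is otherwise identical.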

Evaluation Results

Performance results of each model on this benchmark

Model	Params (M)	WER	Paper Title	Repository
Att	-	18.7	End-to-end Speech Recognition with Adaptive Computation Steps	-
CTC/Att	-	6.7	A Comparative Study on Transformer vs RNN in Speech Applications
BRA-E	8.5	6.63	Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognition	-
CTC-CRF 4gram-LM	-	6.34	CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency
BAT	90	4.97	BAT: Boundary aware transducer for memory-efficient and low-latency ASR
Paraformer	46.3	4.95	FunASR: A Fundamental End-to-End Speech Recognition Toolkit
U2	47	4.72	Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition
UMA	44.7	4.7	Unimodal Aggregation for CTC-based Speech Recognition
Lightweight Transducer	45.3	4.31	Lightweight Transducer Based on Frame-Level Criterion
SE-WSBO With LM	46	4.1	Improving Mandarin Speech Recogntion with Block-augmented Transformer
CIF-HKD With LM	47	4.1	Knowledge Transfer from Pre-trained Language Models to Cif-based Speech Recognizers via Hierarchical Distillation
Lightweight Transducer With LM	45.3	4.03	Lightweight Transducer Based on Frame-Level Criterion
Zipformer+CR-CTC (no external language model)	66.2	4.02	CR-CTC: Consistency regularization on CTC for improved speech recognition
Paraformer-large	220	1.95	FunASR: A Fundamental End-to-End Speech Recognition Toolkit
MMSpeech With LM	-	1.9	MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition
Qwen-Audio	-	1.29	Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models
Seed-ASR	-	0.68	Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition	-
FireRedASR-AED	1,100	0.55	FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration