Speech Recognition On Librispeech Test Other
Evaluation Metric
Word Error Rate (WER)
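WER is the word-level edit distance between the recognizer's hypothesis and the reference transcript, divided by the number of reference words: WER = (S + D + I) / N, where S, D, and I count substituted, deleted, and inserted words. The scores in the table below are percentages; lower is better. A minimal sketch of the computation in Python, assuming simple whitespace tokenization (the function name `word_error_rate` and the example sentences are illustrative, not taken from this leaderboard):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j]: edit distance between the first i reference words
    # and the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i          # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j          # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # substitution or match
            )
    return dp[len(ref)][len(hyp)] / len(ref)

# Two insertions against a 4-word reference -> WER = 2/4 = 0.5 (50%)
print(word_error_rate("the cat sat down", "the cat has sat down quickly"))
```

Note that published numbers also depend on text normalization (casing, punctuation, numerals), so WER values are only directly comparable when the scoring pipelines match.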
Evaluation Results
Performance results of each model on this benchmark
| Model Name | Word Error Rate (WER) | Paper Title | Repository |
| --- | --- | --- | --- |
| Local Prior Matching (Large Model) | 20.84 | Semi-Supervised Speech Recognition via Local Prior Matching | - |
| Snips | 16.5 | Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces | - |
| Local Prior Matching (Large Model, ConvLM LM) | 15.28 | Semi-Supervised Speech Recognition via Local Prior Matching | - |
| Deep Speech 2 | 13.25 | Deep Speech 2: End-to-End Speech Recognition in English and Mandarin | - |
| TDNN + pNorm + speed up/down speech | 12.5 | - | - |
| CTC-CRF 4gram-LM | 10.65 | CRF-based Single-stage Acoustic Modeling with CTC Topology | - |
| Convolutional Speech Recognition | 10.47 | Fully Convolutional Speech Recognition | - |
| MT4SSL | 9.6 | MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets | - |
| Jasper DR 10x5 | 8.79 | Jasper: An End-to-End Convolutional Neural Acoustic Model | - |
| Espresso | 8.7 | Espresso: A Fast End-to-end Neural Speech Recognition Toolkit | - |
| Jasper DR 10x5 (+ Time/Freq Masks) | 7.84 | Jasper: An End-to-End Convolutional Neural Acoustic Model | - |
| tdnn + chain + rnnlm rescoring | 7.63 | Neural Network Language Modeling with Letter-based Features and Importance Sampling | - |
| QuartzNet15x5 | 7.25 | QuartzNet: Deep Automatic Speech Recognition with 1D Time-Channel Separable Convolutions | - |
| Conformer with Relaxed Attention | 6.85 | Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition | - |
| LAS (no LM) | 6.5 | SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition | - |
| Squeezeformer (L) | 5.97 | Squeezeformer: An Efficient Transformer for Automatic Speech Recognition | - |
| LAS + SpecAugment | 5.8 | SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition | - |
| Multi-Stream Self-Attention With Dilated 1D Convolutions | 5.80 | State-of-the-Art Speech Recognition Using Multi-Stream Self-Attention With Dilated 1D Convolutions | - |
| Transformer | 5.7 | A Comparative Study on Transformer vs RNN in Speech Applications | - |
| LSTM Transducer | 5.6 | Librispeech Transducer Model with Internal Language Model Prior Correction | - |