Speech Recognition On Librispeech Test Other
Evaluation Metric
Word Error Rate (WER)
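As a rough illustration of the metric, the sketch below computes WER as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It assumes the scores on this page are percentages, splits on whitespace, and applies no text normalization; the function name `word_error_rate` is illustrative and not taken from any of the listed papers or toolkits.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent for one reference/hypothesis pair.

    Assumption: words are whitespace-separated and compared exactly
    (no casing or punctuation normalization).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution out of four reference words.
    print(word_error_rate("a b c d", "a x c d"))  # 25.0
```

Benchmark scores are normally computed over a whole test set by summing the errors and the reference word counts across all utterances before dividing, rather than by averaging per-utterance values.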
Evaluation Results
Performance results of each model on this benchmark
| Model | Word Error Rate (WER) | Paper Title | Repository |
|---|---|---|---|
| Local Prior Matching (Large Model) | 20.84 | Semi-Supervised Speech Recognition via Local Prior Matching | |
| Snips | 16.5 | Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces | |
| Local Prior Matching (Large Model, ConvLM LM) | 15.28 | Semi-Supervised Speech Recognition via Local Prior Matching | |
| Deep Speech 2 | 13.25 | Deep Speech 2: End-to-End Speech Recognition in English and Mandarin | |
| TDNN + pNorm + speed up/down speech | 12.5 | - | - |
| CTC-CRF 4gram-LM | 10.65 | CRF-based Single-stage Acoustic Modeling with CTC Topology | - |
| Convolutional Speech Recognition | 10.47 | Fully Convolutional Speech Recognition | - |
| MT4SSL | 9.6 | MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets | |
| Jasper DR 10x5 | 8.79 | Jasper: An End-to-End Convolutional Neural Acoustic Model | |
| Espresso | 8.7 | Espresso: A Fast End-to-end Neural Speech Recognition Toolkit | |
| Jasper DR 10x5 (+ Time/Freq Masks) | 7.84 | Jasper: An End-to-End Convolutional Neural Acoustic Model | |
| tdnn + chain + rnnlm rescoring | 7.63 | Neural Network Language Modeling with Letter-based Features and Importance Sampling | - |
| QuartzNet15x5 | 7.25 | QuartzNet: Deep Automatic Speech Recognition with 1D Time-Channel Separable Convolutions | |
| Conformer with Relaxed Attention | 6.85 | Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition | |
| LAS (no LM) | 6.5 | SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition | |
| Squeezeformer (L) | 5.97 | Squeezeformer: An Efficient Transformer for Automatic Speech Recognition | |
| LAS + SpecAugment | 5.8 | SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition | |
| Multi-Stream Self-Attention With Dilated 1D Convolutions | 5.80 | State-of-the-Art Speech Recognition Using Multi-Stream Self-Attention With Dilated 1D Convolutions | |
| Transformer | 5.7 | A Comparative Study on Transformer vs RNN in Speech Applications | |
| LSTM Transducer | 5.6 | Librispeech Transducer Model with Internal Language Model Prior Correction | |