Speech Recognition on LRS3-TED
Metrics
Word Error Rate (WER)
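As a rough illustration of how this metric is usually defined, WER is the word-level edit distance (substitutions + deletions + insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The sketch below assumes whitespace tokenization and no text normalization. The function name `word_error_rate` is hypothetical, and the scoring pipeline behind the numbers on this page may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a fraction (multiply by 100 for a percentage).

    Assumption: words are split on whitespace and compared exactly;
    real evaluations often lowercase and strip punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)
```

For example, scoring the hypothesis "the cat sit on mat" against the reference "the cat sat on the mat" gives one substitution and one deletion over six reference words, so the function returns 2/6.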
Results
Performance results of various models on this benchmark
Comparison table
Model name | Word Error Rate (WER) |
---|---|
whisper-flamingo-integrating-visual-features | 0.68 |
jointly-learning-visual-and-auditory-speech | 1.4 |
large-language-models-are-strong-audio-visual | 0.81 |
learning-audio-visual-speech-representation-1 | 1.3 |