Whisper v2 +demucs | 3.2 | 66.1 | 34.9 | 43.3 | Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark | |
Whisper v3 +demucs | 3.2 | 69.4 | 30.9 | 44.9 | Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark | |
Whisper v2 +demucs +lang | - | 65.6 | 36.1 | 38.2 | Lyrics Transcription for Humans: A Readability-Aware Benchmark | |
OWSM v3.1 +demucs +lang | - | 40.9 | 22.3 | 78.5 | Lyrics Transcription for Humans: A Readability-Aware Benchmark | |
Whisper v3 +demucs +lang | - | 69.3 | 32.0 | 44.9 | Lyrics Transcription for Humans: A Readability-Aware Benchmark | |