Whisper v3 +demucs | 3.6 | 52.4 | 28.7 | 61.5 | Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark | |
Whisper v2 +demucs +lang | - | 52.6 | 34.3 | 34.9 | Lyrics Transcription for Humans: A Readability-Aware Benchmark | |
Whisper v2 +demucs | 7.1 | 56.4 | 17.2 | 38.8 | Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark | |
Whisper v3 +demucs +lang | - | 54.7 | 34.4 | 58.6 | Lyrics Transcription for Humans: A Readability-Aware Benchmark | |