HyperAI超神经

Automatic Lyrics Transcription On Jam Alt 2

Evaluation Metrics

Case Error Rate

Line break F-1

Punctuation F-1

Word Error Rate (WER)
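
For readers unfamiliar with the last metric, the sketch below computes word error rate under its textbook definition: the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. The function name `word_error_rate` and the plain whitespace tokenization are assumptions made for illustration; the benchmark's own tokenization and text normalization are not described on this page and may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: tokens are split on whitespace and compared
    exactly, with no case or punctuation normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    ref = "hello darkness my old friend"
    hyp = "hello darkness my friend"
    print(f"WER: {word_error_rate(ref, hyp):.2f}")
```

The function returns a fraction. If the figures in the results table are percentages, which seems likely but is not stated on this page, multiply by 100 before comparing.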

Results

Performance of each model on this benchmark

Model	Case Error Rate	Line break F-1	Punctuation F-1	Word Error Rate (WER)	Paper Title	Repository
Whisper v3 +demucs	3.6	52.4	28.7	61.5	Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark
OWSM v3.1 +demucs +lang	-	33.5	9.0	70.8	Lyrics Transcription for Humans: A Readability-Aware Benchmark
AudioShake v3	-	81.5	56.7	12.6	Lyrics Transcription for Humans: A Readability-Aware Benchmark
Whisper v2 +demucs +lang	-	52.6	34.3	34.9	Lyrics Transcription for Humans: A Readability-Aware Benchmark
AudioShake v1	4.1	82.7	47.8	22.5	Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark
Whisper v2 +lang	-	71.5	52.5	21.9	Lyrics Transcription for Humans: A Readability-Aware Benchmark
Whisper v3 +lang	-	74.5	44.5	22.4	Lyrics Transcription for Humans: A Readability-Aware Benchmark
Whisper v3 +demucs	-	52.3	32.4	61.5	Lyrics Transcription for Humans: A Readability-Aware Benchmark
Whisper v2 +demucs	7.1	56.4	17.2	38.8	Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark
Whisper v3	-	73.7	42.5	28.6	Lyrics Transcription for Humans: A Readability-Aware Benchmark
Whisper v2	6.5	71.7	50.0	25.7	Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark
Whisper v2 +demucs	-	56.6	40.4	39.6	Lyrics Transcription for Humans: A Readability-Aware Benchmark
Whisper v2	-	71.7	52.8	25.8	Lyrics Transcription for Humans: A Readability-Aware Benchmark
Whisper v3	5.0	73.7	41.9	28.6	Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark
OWSM v3.1 +lang	-	30.2	8.8	73.3	Lyrics Transcription for Humans: A Readability-Aware Benchmark
Whisper v3 +demucs +lang	-	54.7	34.4	58.6	Lyrics Transcription for Humans: A Readability-Aware Benchmark
