HyperAI

Text Summarization on Reddit TIFU

Evaluation Metrics

ROUGE-1
ROUGE-2
ROUGE-L
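
For readers unfamiliar with these metrics, the following is a minimal sketch of how ROUGE-1, ROUGE-2, and ROUGE-L F1 scores can be computed. It assumes lowercased whitespace tokenization with no stemming, and it is not the official scorer used to produce the numbers on this page; the function names (`rouge_n`, `rouge_l`, and the helpers) are hypothetical. Scores are returned in the range 0 to 1, whereas leaderboards like this one typically report them multiplied by 100.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when either is undefined."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1: clipped n-gram overlap between candidate and reference."""
    # Assumption: simple lowercased whitespace tokenization, no stemming.
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, via row-by-row DP."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


if __name__ == "__main__":
    summary = "the cat sat on the mat"
    target = "the cat is on the mat"
    for name, score in [
        ("ROUGE-1", rouge_n(summary, target, 1)),
        ("ROUGE-2", rouge_n(summary, target, 2)),
        ("ROUGE-L", rouge_l(summary, target)),
    ]:
        print(f"{name}: {100 * score:.2f}")
```

Published results usually come from an established scorer with its own tokenization and stemming rules, so values from this sketch may differ slightly from reported ones.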

Evaluation Results

Performance results of each model on this benchmark

Comparison Table

Model name                                        ROUGE-1  ROUGE-2  ROUGE-L
extractive-summarization-as-text-matching         25.09    6.17     20.13
muppet-massive-multi-task-representations         30.3     11.25    24.92
summareranker-a-multi-task-mixture-of-experts-1   29.83    9.5      23.47
calibrating-sequence-likelihood-improves          32.03    11.13    25.51
better-fine-tuning-by-reducing                    30.31    10.98    24.74