Video Captioning on YouCook2
Evaluation Metrics
BLEU-4
CIDEr
METEOR
ROUGE-L
Results
Performance of each model on this benchmark
Comparison Table
Model | BLEU-4 | CIDEr | METEOR | ROUGE-L |
---|---|---|---|---|
howtocaption-prompting-llms-to-transform | 8.8 | 116.4 | 15.9 | 37.3 |
vast-a-vision-audio-subtitle-text-omni-1 | 18.2 | 1.99 | - | - |
videobert-a-joint-model-for-video-and | 4.33 | 0.55 | 11.94 | 28.80 |
ma-lmm-memory-augmented-large-multimodal | - | 1.31 | 17.6 | - |
omnivl-one-foundation-model-for-image | 8.72 | 1.16 | 14.83 | 36.09 |
cosa-concatenated-sample-pretrained-vision | 10.1 | 1.31 | - | - |
meltr-meta-loss-transformer-for-learning-to | 17.92 | 1.90 | 22.56 | 47.04 |
univilm-a-unified-video-and-language-pre | 17.35 | 1.81 | 22.35 | 46.52 |
multimodal-pretraining-for-dense-video | 12.04 | 1.22 | 18.32 | 39.03 |
text-with-knowledge-graph-augmented | 11.7 | 1.33 | 14.8 | 40.2 |
vlm-task-agnostic-video-language-model-pre | 12.27 | 1.3869 | 18.22 | 41.51 |
video-text-modeling-with-zero-shot-transfer | 14.2 | 1.28 | - | 37.7 |
end-to-end-dense-video-captioning-with-masked | 4.38 | 0.38 | 11.55 | 27.44 |
coot-cooperative-hierarchical-transformer-for | 11.30 | 0.57 | 19.85 | 37.94 |
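
The table above can be read programmatically to rank models by a chosen metric. The following is a minimal sketch, not part of the benchmark itself: it copies a few rows from the table into a string, treats `-` as a missing score, and sorts by one column. The helper names (`parse_cell`, `parse_rows`, `rank_by`) are hypothetical. Note that the CIDEr column appears to mix scales (116.4 in one row against values near 1 to 2 in the others), so this sketch does not attempt to rank by CIDEr.

```python
from typing import Dict, List, Optional

# A few rows copied verbatim from the comparison table (illustrative subset).
TABLE = """\
howtocaption-prompting-llms-to-transform | 8.8 | 116.4 | 15.9 | 37.3 |
vast-a-vision-audio-subtitle-text-omni-1 | 18.2 | 1.99 | - | - |
meltr-meta-loss-transformer-for-learning-to | 17.92 | 1.90 | 22.56 | 47.04 |
videobert-a-joint-model-for-video-and | 4.33 | 0.55 | 11.94 | 28.80 |
"""

METRICS = ["BLEU-4", "CIDEr", "METEOR", "ROUGE-L"]


def parse_cell(cell: str) -> Optional[float]:
    """Convert a table cell to a float; '-' marks an unreported score."""
    cell = cell.strip()
    return None if cell == "-" else float(cell)


def parse_rows(text: str) -> List[Dict[str, object]]:
    """Split each pipe-delimited line into a model name plus metric values."""
    rows = []
    for line in text.strip().splitlines():
        # Drop the trailing pipe, then split on the remaining separators.
        cells = [c.strip() for c in line.strip().rstrip("|").split("|")]
        name, values = cells[0], cells[1:]
        row: Dict[str, object] = {"model": name}
        for metric, value in zip(METRICS, values):
            row[metric] = parse_cell(value)
        rows.append(row)
    return rows


def rank_by(rows: List[Dict[str, object]], metric: str) -> List[Dict[str, object]]:
    """Sort rows by a metric in descending order, skipping unreported scores."""
    scored = [r for r in rows if r[metric] is not None]
    return sorted(scored, key=lambda r: r[metric], reverse=True)


rows = parse_rows(TABLE)
for r in rank_by(rows, "BLEU-4"):
    print(r["model"], r["BLEU-4"])
```

Rows with an unreported score are excluded from the ranking for that metric rather than being treated as zero, so a model missing METEOR is simply absent from a METEOR ranking.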