HyperAI

Temporal Relation Extraction On Vinoground

Metrics

Group Score
Text Score
Video Score
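
This page does not define the three metrics. The sketch below assumes the Winoground-style convention that Vinoground is generally described as following: each example is a pair of videos with a pair of captions, the text score counts an example when the correct caption is chosen for both videos, the video score counts it when the correct video is chosen for both captions, and the group score requires both at once. The `PairResult` record and the `vinoground_scores` function are hypothetical names used for illustration only and do not come from any official evaluation code.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class PairResult:
    """Hypothetical per-example record (an assumption, not an official format)."""
    # For each of the two videos: did the model pick the matching caption?
    text_correct: Tuple[bool, bool]
    # For each of the two captions: did the model pick the matching video?
    video_correct: Tuple[bool, bool]


def vinoground_scores(results: List[PairResult]) -> Dict[str, float]:
    """Return group, text, and video scores as percentages.

    Assumes the Winoground-style definitions described in the lead-in.
    """
    n = len(results)
    if n == 0:
        raise ValueError("need at least one example")

    # An example counts for a metric only if both of its questions are right.
    text_hits = sum(all(r.text_correct) for r in results)
    video_hits = sum(all(r.video_correct) for r in results)
    group_hits = sum(
        all(r.text_correct) and all(r.video_correct) for r in results
    )

    return {
        "group": 100.0 * group_hits / n,
        "text": 100.0 * text_hits / n,
        "video": 100.0 * video_hits / n,
    }
```

Under these assumed definitions the group score can never exceed the text score or the video score, which is a useful sanity check when reading the table.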

Results

Performance results of various models on this benchmark

Comparison table

| Model name | Group Score | Text Score | Video Score |
| --- | --- | --- | --- |
| Model 1 | 24.6 | 54 | 38.2 |
| Model 2 | 6.2 | 21.8 | 25.6 |
| Model 3 | 6.8 | 21.8 | 26.2 |
| Model 4 | 3.8 | 23 | 21.2 |
| qwen2-vl-enhancing-vision-language-model-s | 15.2 | 40.2 | 32.4 |
| Model 6 | 35 | 59.2 | 51 |
| Model 7 | 6.2 | 24 | 22.4 |
| Model 8 | 10.6 | 32.8 | 28.8 |
| imagebind-one-embedding-space-to-bind-them | 0.6 | 9.4 | 3.4 |
| llava-onevision-easy-visual-task-transfer | 21.8 | 48.4 | 35.2 |
| vtimellm-empower-llm-to-grasp-video-moments | 5.2 | 19.4 | 27 |
| ma-lmm-memory-augmented-large-multimodal | 6.8 | 23.8 | 25.6 |
| llava-onevision-easy-visual-task-transfer | 14.6 | 41.6 | 29.4 |
| gemini-1-5-unlocking-multimodal-understanding | 12.4 | 37 | 27.6 |
| gemini-1-5-unlocking-multimodal-understanding | 10.2 | 35.8 | 22.6 |
| internlm-xcomposer-2-5-a-versatile-large | 9.6 | 28.8 | 27.8 |
| videoclip-contrastive-pre-training-for-zero | 1.2 | 17 | 2.8 |
| video-llava-learning-united-visual-1 | 6.6 | 24.8 | 25.8 |
| languagebind-extending-video-language | 1.2 | 10.6 | 5 |
| videollama-2-advancing-spatial-temporal | 8.4 | 36.2 | 21.8 |
| internlm-xcomposer-2-5-a-versatile-large | 9 | 30.8 | 28.4 |
| Model 22 | 5.2 | 25.8 | 22.2 |
| qwen2-vl-enhancing-vision-language-model-s | 17.4 | 50.4 | 32.6 |
| 2408-01800 | 11.2 | 32.6 | 29.2 |