Video Based Generative Performance 4
Metrics
gpt-score
Results
Performance results of various models on this benchmark
Comparison table
Model name | gpt-score |
---|---|
video-chatgpt-towards-detailed-video | 2.52 |
video-llama-an-instruction-tuned-audio-visual | 2.18 |
minigpt4-video-advancing-multimodal-llms-for | 3.02 |
videochat-chat-centric-video-understanding | 2.50 |
llama-adapter-v2-parameter-efficient-visual | 2.32 |
st-llm-large-language-models-are-effective-1 | 3.05 |
slowfast-llava-a-strong-training-free | 2.96 |
moviechat-from-dense-token-to-sparse-memory | 2.93 |
chat-univi-unified-visual-representation | 2.91 |
vtimellm-empower-llm-to-grasp-video-moments | 3.10 |
ts-llava-constructing-visual-tokens-through | 3.03 |
one-for-all-video-conversation-is-feasible | 2.46 |
mvbench-a-comprehensive-multi-modal-video | 2.88 |
videogpt-integrating-image-and-video-encoders | 3.18 |
one-for-all-video-conversation-is-feasible | 2.69 |
pllava-parameter-free-llava-extension-from-1 | 3.20 |
mvbench-a-comprehensive-multi-modal-video | 2.86 |
ppllava-varied-video-sequence-understanding | 3.56 |
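
The table above is listed in no particular order, so a ranking has to be derived by sorting on the gpt-score column. The following is a minimal sketch of that step, assuming the rows are transcribed by hand into a Python list; the names `ROWS` and `rank_by_score` are illustrative and not part of the benchmark or any tooling.

```python
# Rows copied from the comparison table: (model slug, gpt-score).
# Two slugs appear twice with different scores, so a list of tuples
# is used instead of a dict, which would silently drop one entry each.
ROWS = [
    ("video-chatgpt-towards-detailed-video", 2.52),
    ("video-llama-an-instruction-tuned-audio-visual", 2.18),
    ("minigpt4-video-advancing-multimodal-llms-for", 3.02),
    ("videochat-chat-centric-video-understanding", 2.50),
    ("llama-adapter-v2-parameter-efficient-visual", 2.32),
    ("st-llm-large-language-models-are-effective-1", 3.05),
    ("slowfast-llava-a-strong-training-free", 2.96),
    ("moviechat-from-dense-token-to-sparse-memory", 2.93),
    ("chat-univi-unified-visual-representation", 2.91),
    ("vtimellm-empower-llm-to-grasp-video-moments", 3.10),
    ("ts-llava-constructing-visual-tokens-through", 3.03),
    ("one-for-all-video-conversation-is-feasible", 2.46),
    ("mvbench-a-comprehensive-multi-modal-video", 2.88),
    ("videogpt-integrating-image-and-video-encoders", 3.18),
    ("one-for-all-video-conversation-is-feasible", 2.69),
    ("pllava-parameter-free-llava-extension-from-1", 3.20),
    ("mvbench-a-comprehensive-multi-modal-video", 2.86),
    ("ppllava-varied-video-sequence-understanding", 3.56),
]


def rank_by_score(rows):
    """Return the rows sorted by gpt-score, highest first."""
    return sorted(rows, key=lambda row: row[1], reverse=True)


ranked = rank_by_score(ROWS)

# Print the three highest-scoring entries.
for position, (slug, score) in enumerate(ranked[:3], start=1):
    print(f"{position}. {slug}: {score:.2f}")
```

Keeping the rows as tuples preserves the repeated slugs as separate entries, which matters here because the table reports two different scores for each of them.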