HyperAI
Zero-Shot Video Question Answering on TVQA
Metrics
Accuracy
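The page does not define the metric, so the following is only a minimal sketch. It assumes accuracy means the percentage of questions answered correctly, and that each TVQA question is multiple choice with one correct option. The function name `accuracy` and the toy data are hypothetical and are not taken from any of the listed papers.

```python
def accuracy(predictions, answers):
    """Return the percentage of predictions that match the answer key."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count positions where the predicted option index equals the gold index.
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)


# Hypothetical toy data: chosen option indices for five questions.
predicted = [0, 3, 2, 4, 1]
gold = [0, 3, 1, 4, 2]
print(f"{accuracy(predicted, gold):.1f}")  # 3 of 5 correct, prints 60.0
```

Under this reading, a score such as 59.7 would mean that roughly 60 of every 100 questions were answered correctly.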
Results
Performance results of various models on this benchmark
| Model | Accuracy (%) | Paper Title | Repository |
|---|---|---|---|
| VideoChat2 (no speech) | 40.6 | MVBench: A Comprehensive Multi-modal Video Understanding Benchmark | - |
| FrozenBiLM (with speech) | 59.7 | Zero-Shot Video Question Answering via Frozen Bidirectional Language Models | - |
| FrozenBiLM (no speech) | 29.7 | Zero-Shot Video Question Answering via Frozen Bidirectional Language Models | - |
| InternVideo (no speech) | 35.9 | InternVideo: General Video Foundation Models via Generative and Discriminative Learning | - |
| IG-VLM (no speech, GPT-4V) | 57.8 | An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM | - |
| VideoChat_HD_mistral (no speech) | 50.6 | MVBench: A Comprehensive Multi-modal Video Understanding Benchmark | - |
| VideoChat_mistral (no speech) | 46.4 | MVBench: A Comprehensive Multi-modal Video Understanding Benchmark | - |
| SeViLA (no speech) | 38.2 | Self-Chained Image-Language Model for Video Localization and Question Answering | - |
| MiniGPT4-video-7B | 54.21 | MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens | - |