
Video Question Answering On Next Qa Efficient

Metrics

1:1 Accuracy

Results

Performance results of various models on this benchmark

| Model Name | 1:1 Accuracy | Paper Title | Repository |
| --- | --- | --- | --- |
| SeViLA (4 frames) | 73.8 | Self-Chained Image-Language Model for Video Localization and Question Answering | - |
| ViLA (3B, 4 frames) | 74.4 | ViLA: Efficient Video-Language Alignment for Video Question Answering | - |