Question Answering on NExT-QA (Open-Ended)
Evaluation Metrics
Accuracy
Confidence Score
Evaluation Results
Performance of each model on this benchmark
Model Name | Accuracy | Confidence Score | Paper Title | Repository |
---|---|---|---|---|
MovieChat | 49.9 | 2.7 | MovieChat: From Dense Token to Sparse Memory for Long Video Understanding | - |
Video-ChatGPT | 54.6 | 3.2 | Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models | - |
VideoChat | 56.6 | 3.2 | VideoChat: Chat-Centric Video Understanding | - |
Vista-LLaMA | 60.7 | 3.4 | Vista-LLaMA: Reliable Video Narrator via Equal Distance to Visual Tokens | - |
Flash-VStream | 61.6 | 3.4 | Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams | - |
MovieChat+ | 54.8 | 3.0 | MovieChat+: Question-aware Sparse Memory for Long Video Question Answering | - |