Visual Question Answering on MMBench
Metrics
GPT-3.5 score
Results
Performance of each model on this benchmark
Model Name | GPT-3.5 score | Paper Title | Repository |
---|---|---|---|
Video-LaVIT | 67.3 | Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization | |
DreamLLM-7B | 49.9 | DreamLLM: Synergistic Multimodal Comprehension and Creation | |
CuMo-7B | 73.0 | CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts | |
LLaVA-InternLM2-ViT + MoSLoRA | 73.8 | Mixture-of-Subspaces in Low-Rank Adaptation | |
LLaVA-LLaMA3-8B-ViT + MoSLoRA | 73.0 | Mixture-of-Subspaces in Low-Rank Adaptation | |