HyperAI초신경
Mmr Total on MRR Benchmark

Evaluation Metric
Total Column Score

Evaluation Results
Performance of each model on this benchmark.
| Model Name | Total Column Score | Paper Title | Repository |
| --- | --- | --- | --- |
| InternVL2-8B | 368 | InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | |
| Idefics-80B | 139 | OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents | |
| Idefics-2-8B | 256 | What matters when building vision-language models? | - |
| InternVL2-1B | 237 | InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | |
| GPT-4o | 457 | GPT-4o: Visual perception performance of multimodal large language models in piglet activity understanding | - |
| Phi-3-Vision | 397 | Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone | - |
| Qwen-vl-max | 366 | Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond | |
| LLaVA-NEXT-13B | 335 | Visual Instruction Tuning | |
| LLaVA-NEXT-34B | 412 | Visual Instruction Tuning | |
| GPT-4V | 415 | The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision) | |
| Claude 3.5 Sonnet | 463 | Claude 3.5 Sonnet Model Card Addendum | - |
| Monkey-Chat-7B | 214 | Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models | |
| Qwen-vl-plus | 310 | Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond | |
| LLaVA-1.5-13B | 243 | Visual Instruction Tuning | |
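The ranking implied by the results above can be reproduced directly from the raw scores. A minimal Python sketch follows; the score dictionary is copied from the table, and the sorting logic is purely illustrative, not the site's own tooling:

```python
# Total Column Score per model, as listed in the benchmark table above.
results = {
    "InternVL2-8B": 368,
    "Idefics-80B": 139,
    "Idefics-2-8B": 256,
    "InternVL2-1B": 237,
    "GPT-4o": 457,
    "Phi-3-Vision": 397,
    "Qwen-vl-max": 366,
    "LLaVA-NEXT-13B": 335,
    "LLaVA-NEXT-34B": 412,
    "GPT-4V": 415,
    "Claude 3.5 Sonnet": 463,
    "Monkey-Chat-7B": 214,
    "Qwen-vl-plus": 310,
    "LLaVA-1.5-13B": 243,
}

# Sort models by score, highest first, to obtain the leaderboard order.
ranked = sorted(results.items(), key=lambda kv: kv[1], reverse=True)

for rank, (model, score) in enumerate(ranked, start=1):
    print(f"{rank:2d}. {model}: {score}")
```

On these numbers, Claude 3.5 Sonnet (463) leads, followed by GPT-4o (457), with Idefics-80B (139) last.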