Mmr Total on MRR Benchmark
Evaluation Metric
Total Column Score
Evaluation Results
Performance of each model on this benchmark
| Model Name | Total Column Score | Paper Title | Repository |
|---|---|---|---|
| Claude 3.5 Sonnet | 463 | Claude 3.5 Sonnet Model Card Addendum | - |
| GPT-4o | 457 | GPT-4o: Visual perception performance of multimodal large language models in piglet activity understanding | - |
| GPT-4V | 415 | The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision) | |
| LLaVA-NEXT-34B | 412 | Visual Instruction Tuning | |
| Phi-3-Vision | 397 | Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone | - |
| InternVL2-8B | 368 | InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | |
| Qwen-vl-max | 366 | Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond | |
| LLaVA-NEXT-13B | 335 | Visual Instruction Tuning | |
| Qwen-vl-plus | 310 | Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond | |
| Idefics-2-8B | 256 | What matters when building vision-language models? | - |
| LLaVA-1.5-13B | 243 | Visual Instruction Tuning | |
| InternVL2-1B | 237 | InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | |
| Monkey-Chat-7B | 214 | Monkey: Image Resolution and Text Label Are Important Things for Large Multi-modal Models | |
| Idefics-80B | 139 | OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents | |
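For readers who want to work with these results programmatically, the sketch below loads a few of the rows above into a small Python structure and ranks models by Total Column Score. The `LeaderboardEntry` class and `rank_by_score` helper are illustrative only and are not part of any official tooling for this benchmark.

```python
from dataclasses import dataclass


@dataclass
class LeaderboardEntry:
    model: str
    total_column_score: int
    paper_title: str


# A few rows transcribed by hand from the table above.
ENTRIES = [
    LeaderboardEntry("Claude 3.5 Sonnet", 463, "Claude 3.5 Sonnet Model Card Addendum"),
    LeaderboardEntry("GPT-4o", 457,
                     "GPT-4o: Visual perception performance of multimodal large "
                     "language models in piglet activity understanding"),
    LeaderboardEntry("GPT-4V", 415,
                     "The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)"),
    LeaderboardEntry("LLaVA-NEXT-34B", 412, "Visual Instruction Tuning"),
    LeaderboardEntry("Phi-3-Vision", 397,
                     "Phi-3 Technical Report: A Highly Capable Language Model "
                     "Locally on Your Phone"),
]


def rank_by_score(entries):
    """Return entries sorted by Total Column Score, highest first."""
    return sorted(entries, key=lambda e: e.total_column_score, reverse=True)


if __name__ == "__main__":
    for place, entry in enumerate(rank_by_score(ENTRIES), start=1):
        print(f"{place}. {entry.model}: {entry.total_column_score}")
```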