HyperAI

Visual Question Answering on MM-Vet v2

Metrics

GPT-4 score

Results

Performance results of various models on this benchmark

Comparison table
Model name                                      GPT-4 score
Model 1                                         45.5±0.1
Model 2                                         68.4±0.3
claude-3-5-sonnet-model-card-addendum           71.8±0.2
cogvlm-visual-expert-for-pretrained-language    45.1±0.2
Model 5                                         50.9±0.1
qwen-vl-a-frontier-large-vision-language        55.8±0.2
generative-multimodal-models-are-in-context     38.0±0.1
mimic-it-multi-modal-in-context-instruction     23.2±0.1
Model 9                                         77.1±0.1
Model 10                                        63.8±0.2
gemini-a-family-of-highly-capable-multimodal-1  57.2±0.2
openflamingo-an-open-source-framework-for       17.6±0.2
gpt-4-technical-report-1                        72.1±0.2
improved-baselines-with-visual-instruction      33.2±0.1
Model 15                                        55.8±0.2
how-far-are-we-to-gpt-4v-closing-the-gap-to     51.5±0.2
internlm-xcomposer2-mastering-free-form-text    42.5±0.3
gpt-4-technical-report-1                        71.0±0.2
improved-baselines-with-visual-instruction      28.3±0.2
qwen2-vl-enhancing-vision-language-model-s      66.9±0.3
gpt-4-technical-report-1                        66.8±0.3
cogagent-a-visual-language-model-for-gui        34.7±0.2
gpt-4-technical-report-1                        66.3±0.2
gemini-1-5-unlocking-multimodal-understanding   66.9±0.2
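
Each score in the table is written as a central value followed by ± and a spread. As a minimal sketch, the Python below shows one way to split such values into numbers and rank entries by the central value. The names `parse_score` and `rank` are hypothetical helpers, the sample rows are copied from a few table entries, and the page does not say how the ± value is computed, so it is treated here only as an opaque spread.

```python
import re

# A few (model identifier, score) pairs copied from the table for illustration.
ROWS = [
    ("claude-3-5-sonnet-model-card-addendum", "71.8±0.2"),
    ("qwen-vl-a-frontier-large-vision-language", "55.8±0.2"),
    ("openflamingo-an-open-source-framework-for", "17.6±0.2"),
    ("gemini-1-5-unlocking-multimodal-understanding", "66.9±0.2"),
]

# Matches strings such as "71.8±0.2": a number, the ± sign, another number.
SCORE_RE = re.compile(r"^(\d+(?:\.\d+)?)±(\d+(?:\.\d+)?)$")


def parse_score(text):
    """Split a 'value±spread' string into a (value, spread) pair of floats."""
    match = SCORE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized score format: {text!r}")
    return float(match.group(1)), float(match.group(2))


def rank(rows):
    """Return (name, value, spread) tuples sorted by value, highest first."""
    parsed = [(name, *parse_score(score)) for name, score in rows]
    return sorted(parsed, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, value, spread in rank(ROWS):
        print(f"{value:5.1f} ± {spread:.1f}  {name}")
```

Sorting on the central value alone ignores the spread, so entries whose scores differ by less than their spreads should not be read as strictly ordered.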