Visual Question Answering on VCR (Q-A) Test

Evaluation Metric

Accuracy

Evaluation Results

Performance results of each model on this benchmark

| Model Name | Accuracy | Paper Title | Repository |
| VL-BERT_LARGE | 75.8 | VL-BERT: Pre-training of Generic Visual-Linguistic Representations | |
| MAD (Single Model, Formerly CLIP-TD) | 79.6 | Multimodal Adaptive Distillation for Leveraging Unimodal Encoders for Vision-Language Tasks | - |
| UNITER (Large) | 77.3 | UNITER: UNiversal Image-TExt Representation Learning | |
| GPT4RoI | 89.4 | GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest | |
| VisualBERT | 71.6 | VisualBERT: A Simple and Performant Baseline for Vision and Language | |
| ERNIE-ViL-large (ensemble of 15 models) | 81.6 | ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph | - |
| UNITER-large (10 ensemble) | 79.8 | UNITER: UNiversal Image-TExt Representation Learning | |
| OFA-X | 71.2 | Harnessing the Power of Multi-Task Pretraining for Ground-Truth Level Natural Language Explanations | |
| OFA-X-MT | 62 | Harnessing the Power of Multi-Task Pretraining for Ground-Truth Level Natural Language Explanations | |
| VL-T5 | 75.3 | Unifying Vision-and-Language Tasks via Text Generation | |
| KVL-BERT_LARGE | 76.4 | KVL-BERT: Knowledge Enhanced Visual-and-Linguistic BERT for Visual Commonsense Reasoning | - |