Visual Entailment on SNLI-VE Test
Evaluation Metric
Accuracy
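Accuracy here is the share of SNLI-VE test pairs whose predicted label (entailment, neutral, or contradiction) matches the gold label, reported as a percentage. Below is a minimal illustrative sketch of that computation; the function name and toy labels are hypothetical and not taken from any of the listed papers.

```python
# Minimal sketch (not tied to any specific model above): accuracy on SNLI-VE,
# where each image-hypothesis pair is labeled entailment / neutral / contradiction.
from typing import Sequence

LABELS = ("entailment", "neutral", "contradiction")  # the three SNLI-VE classes

def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of examples whose predicted label matches the gold label, in percent."""
    assert len(predictions) == len(gold), "prediction/gold length mismatch"
    correct = sum(p == g for p, g in zip(predictions, gold))
    return 100.0 * correct / len(gold)

# Hypothetical toy example: 3 of 4 pairs classified correctly -> 75.0
print(accuracy(
    ["entailment", "neutral", "contradiction", "entailment"],
    ["entailment", "neutral", "contradiction", "neutral"],
))
```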
Evaluation Results
Performance of each model on this benchmark:
| Model Name | Accuracy | Paper Title |
|---|---|---|
| UNITER (Large) | 78.98 | UNITER: UNiversal Image-TExt Representation Learning |
| EVE-ROI* | 70.47 | Visual Entailment: A Novel Task for Fine-Grained Image Understanding |
| MAD (Single Model, Formerly CLIP-TD) | 80.32 | Multimodal Adaptive Distillation for Leveraging Unimodal Encoders for Vision-Language Tasks |
| SimVLM | 86.32 | SimVLM: Simple Visual Language Model Pretraining with Weak Supervision |
| SOHO | 84.95 | Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning |
| Prompt Tuning | 90.12 | Prompt Tuning for Generative Multimodal Pretrained Models |
| CoCa | 87.1 | CoCa: Contrastive Captioners are Image-Text Foundation Models |
| OFA | 91.2 | OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework |