HyperAI초신경

Visual Grounding on RefCOCO testA

Evaluation Metric

IoU
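
The page names IoU (Intersection over Union) as the metric without defining it. As a minimal sketch, assuming axis-aligned bounding boxes given as (x1, y1, x2, y2) corner coordinates, the overlap between a predicted box and a ground-truth box could be computed as follows. The box format and the function name `box_iou` are illustrative assumptions, not taken from the benchmark, and the page does not say how per-example values are aggregated into the reported score.

```python
def box_iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Each box is (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2
    (an assumed format; the benchmark page does not specify one).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped to zero when the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0
```

For example, `box_iou((0, 0, 2, 2), (1, 1, 3, 3))` has an intersection of 1 and a union of 7, so the result is 1/7.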

Evaluation Results

Performance of each model on this benchmark

Model Name	IoU	Paper Title	Repository
HYDRA	61.1	HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning
XFM (base)	-	Toward Building General Foundation Models for Language, Vision, and Vision-Language Understanding Tasks
X2-VLM (large)	-	X$^2$-VLM: All-In-One Pre-trained Model For Vision-Language Tasks
X2-VLM (base)	-	X$^2$-VLM: All-In-One Pre-trained Model For Vision-Language Tasks
X-VLM (base)	-	Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts
mPLUG-2	-	mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video
Florence-2-large-ft	-	Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
