HyperAI초신경
Visual Grounding on RefCOCO testA
Evaluation metric
IoU
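For reference, IoU (intersection over union) measures how much a predicted box overlaps the ground-truth box. The sketch below is a minimal illustration, not the benchmark's evaluation code. It assumes axis-aligned boxes given as (x1, y1, x2, y2) corner coordinates, and the function name `box_iou` is hypothetical. The page does not say how per-sample IoU is aggregated into the reported score, so that step is left out.

```python
def box_iou(box_a, box_b):
    """Return the IoU of two axis-aligned boxes in (x1, y1, x2, y2) format.

    Assumption: x1 <= x2 and y1 <= y2 for both boxes.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) inputs.
    return inter / union if union > 0 else 0.0
```

For example, `box_iou((0, 0, 2, 2), (1, 1, 3, 3))` has an overlap of 1 and a union of 7, so it evaluates to 1/7.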
Evaluation results
Performance of each model on this benchmark
| Model name | IoU | Paper title |
|---|---|---|
| HYDRA | 61.1 | HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning |
| XFM (base) | - | Toward Building General Foundation Models for Language, Vision, and Vision-Language Understanding Tasks |
| X2-VLM (large) | - | X$^2$-VLM: All-In-One Pre-trained Model For Vision-Language Tasks |
| X2-VLM (base) | - | X$^2$-VLM: All-In-One Pre-trained Model For Vision-Language Tasks |
| X-VLM (base) | - | Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts |
| mPLUG-2 | - | mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video |
| Florence-2-large-ft | - | Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks |