Visual Grounding On Refcoco Test B

Evaluation Metric

Accuracy (%)
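
This page does not define how accuracy is computed. On RefCOCO-style grounding benchmarks, a prediction is conventionally counted as correct when the predicted box overlaps the ground-truth box with an intersection over union (IoU) of at least 0.5, and accuracy is the percentage of referring expressions that meet that bar. The sketch below illustrates that convention only. The `(x1, y1, x2, y2)` box format, the 0.5 threshold, and the function names `iou` and `grounding_accuracy` are assumptions for illustration, not details taken from this leaderboard or from any listed paper.

```python
from typing import Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def grounding_accuracy(
    preds: Sequence[Box],
    targets: Sequence[Box],
    threshold: float = 0.5,  # assumed conventional IoU threshold
) -> float:
    """Percentage of predictions whose IoU with the target meets the threshold."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if not preds:
        return 0.0
    hits = sum(iou(p, t) >= threshold for p, t in zip(preds, targets))
    return 100.0 * hits / len(preds)
```

With one prediction per referring expression, `grounding_accuracy(preds, targets)` returns a value on the same 0 to 100 scale as the Accuracy (%) column.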

Evaluation Results

Performance of each model on this benchmark

| Model | Accuracy (%) | Paper Title |
| --- | --- | --- |
| XFM (base) | 79.8 | Toward Building General Foundation Models for Language, Vision, and Vision-Language Understanding Tasks |
| X2-VLM (base) | 78.4 | X$^2$-VLM: All-In-One Pre-trained Model For Vision-Language Tasks |
| mPLUG-2 | 86.05 | mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video |
| X-VLM (base) | 76.91 | Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts |
| Florence-2-large-ft | 92.0 | Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks |
| X2-VLM (large) | 81.8 | X$^2$-VLM: All-In-One Pre-trained Model For Vision-Language Tasks |