Image Captioning On Nocaps Val Near Domain

Evaluation Metrics

CIDEr
Pre-train (#images)
SPICE

Evaluation Results

Performance of each model on this benchmark

| Model | CIDEr | Pre-train (#images) | SPICE | Paper Title | Repository |
| --- | --- | --- | --- | --- | --- |
| BLIP-2 ViT-G OPT 6.7B (zero-shot) | 119.2 | 1.1B | 15.3 | BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | |
| OmniVL | 108.3 | 14M | 14.9 | OmniVL: One Foundation Model for Image-Language and Video-Language Tasks | - |
| VinVL | 96.1 | 5.7M | 13.8 | VinVL: Revisiting Visual Representations in Vision-Language Models | |
| BLIP-2 ViT-G FlanT5 XL (zero-shot) | 120.2 | 1.1B | 15.9 | BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | |
| Enc-Dec | 88.3 | - | 12.1 | Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts | |
| BLIP_ViT-L | 112.1 | 129M | 14.9 | BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation | |
| LEMON_large | 113.3 | 200M | 15.1 | Scaling Up Vision-Language Pre-training for Image Captioning | - |
| SimVLM | 110.9 | 1.8B | - | SimVLM: Simple Visual Language Model Pretraining with Weak Supervision | |
| BLIP_CapFilt-L | 108.6 | 129M | 14.8 | BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation | |
| BLIP-2 ViT-G OPT 2.7B (zero-shot) | 117.8 | 1.1B | 15.4 | BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | |
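
The rows above are listed in page order, not by score. As a reading aid, here is a minimal Python sketch that ranks them by a chosen metric. The scores are copied from the listing; the names `ROWS` and `rank_by` are hypothetical helpers introduced only for this illustration, and a missing score (shown as "-" on the page) is represented as `None` and left out of the ranking for that metric.

```python
# Rows copied from the leaderboard as (model, CIDEr, SPICE).
# None marks a score the page shows as "-".
ROWS = [
    ("BLIP-2 ViT-G OPT 6.7B (zero-shot)", 119.2, 15.3),
    ("OmniVL", 108.3, 14.9),
    ("VinVL", 96.1, 13.8),
    ("BLIP-2 ViT-G FlanT5 XL (zero-shot)", 120.2, 15.9),
    ("Enc-Dec", 88.3, 12.1),
    ("BLIP_ViT-L", 112.1, 14.9),
    ("LEMON_large", 113.3, 15.1),
    ("SimVLM", 110.9, None),
    ("BLIP_CapFilt-L", 108.6, 14.8),
    ("BLIP-2 ViT-G OPT 2.7B (zero-shot)", 117.8, 15.4),
]


def rank_by(rows, column):
    """Return the rows that have a score in `column`, best first.

    `column` is a tuple index: 1 for CIDEr, 2 for SPICE.
    Higher is better for both metrics.
    """
    scored = [row for row in rows if row[column] is not None]
    return sorted(scored, key=lambda row: row[column], reverse=True)


if __name__ == "__main__":
    for place, (model, cider, _spice) in enumerate(rank_by(ROWS, 1), start=1):
        print(f"{place:2d}. {model:36s} CIDEr={cider}")
```

Calling `rank_by(ROWS, 2)` gives the same kind of ranking for SPICE, with SimVLM omitted because the page reports no SPICE score for it.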