Image Retrieval on COCO
Evaluation Metric
Recall@10
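As a rough illustration of how a Recall@K figure like the ones reported here might be computed, the sketch below scores each query against every candidate and checks whether the correct item lands in the top K. It assumes each query has exactly one ground-truth item and that a higher score means a better match; the function name `recall_at_k` and the toy data are hypothetical and not taken from any of the listed papers or their evaluation scripts.

```python
from typing import Sequence


def recall_at_k(
    scores: Sequence[Sequence[float]],
    targets: Sequence[int],
    k: int = 10,
) -> float:
    """Percentage of queries whose ground-truth item is ranked in the top k.

    scores[q][i] is the similarity between query q and candidate i
    (assumption: higher means more similar). targets[q] is the index
    of the single correct candidate for query q (assumption).
    """
    if len(scores) != len(targets):
        raise ValueError("one target index per query is required")
    if not scores:
        raise ValueError("at least one query is required")

    hits = 0
    for row, target in zip(scores, targets):
        # Rank candidate indices by descending similarity score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(scores)


# Toy data: 3 text queries scored against 4 candidate images.
toy_scores = [
    [0.9, 0.1, 0.3, 0.2],  # correct image 0 is ranked first
    [0.2, 0.4, 0.8, 0.5],  # correct image 1 is ranked third
    [0.1, 0.2, 0.3, 0.7],  # correct image 3 is ranked first
]
toy_targets = [0, 1, 3]
toy_recall = recall_at_k(toy_scores, toy_targets, k=2)
```

With `k=2`, two of the three toy queries find their correct image, so the result is about 66.7; raising `k` to 3 brings it to 100. The result is expressed as a percentage to match the scale used in the table.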
Evaluation Results
Performance of each model on this benchmark
Model Name | Recall@10 | Paper Title | Repository |
---|---|---|---|
Oscar | 98.3 | Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks | |
BLIP-2 ViT-G (fine-tuned) | 92.6 | BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | |
VisualSparta | 96.3 | VisualSparta: An Embarrassingly Simple Approach to Large-scale Text-to-Image Search with Weighted Bag-of-words | |
FLAVA (zero-shot) | - | FLAVA: A Foundational Language And Vision Alignment Model | |
CLIP (zero-shot) | - | FLAVA: A Foundational Language And Vision Alignment Model | |
BLIP-2 ViT-L (fine-tuned) | 91.8 | BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | |