Image Retrieval on COCO
Metrics
Recall@10
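As an illustration of the metric, Recall@K for text-to-image retrieval is commonly computed as the percentage of text queries for which at least one matching image appears among the K highest-scoring results. The sketch below follows that common definition and is not taken from any of the listed papers. The function name `recall_at_k`, the similarity matrix, and the toy data are hypothetical, and individual papers may differ in details such as test-split size.

```python
def recall_at_k(similarity, relevant, k=10):
    """Percentage of queries whose top-k retrieved images include a relevant one.

    similarity: list of rows; similarity[q][i] is the score of image i for query q
    relevant:   list of sets; relevant[q] holds the image indices matching query q
    (Both inputs are hypothetical structures chosen for this sketch.)
    """
    hits = 0
    for scores, targets in zip(similarity, relevant):
        # Rank image indices by descending score and keep the first k.
        top_k = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
        # A query counts as a hit if any relevant image is in its top k.
        if targets.intersection(top_k):
            hits += 1
    return 100.0 * hits / len(similarity)


# Toy data: 3 text queries scored against 3 images (made-up scores).
similarity = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.5, 0.6, 0.1],
]
relevant = [{0}, {1}, {2}]

print(recall_at_k(similarity, relevant, k=1))
print(recall_at_k(similarity, relevant, k=2))
```

With this toy data only the first query ranks its matching image first, while the first two queries find it within the top two, so the score rises as K grows.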
Results
Performance results of different models on this benchmark
Model Name | Recall@10 | Paper Title | Repository |
---|---|---|---|
Oscar | 98.3 | Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks | |
BLIP-2 ViT-G (fine-tuned) | 92.6 | BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | |
VisualSparta | 96.3 | VisualSparta: An Embarrassingly Simple Approach to Large-scale Text-to-Image Search with Weighted Bag-of-words | |
FLAVA (zero-shot) | - | FLAVA: A Foundational Language And Vision Alignment Model | |
CLIP (zero-shot) | - | FLAVA: A Foundational Language And Vision Alignment Model | |
BLIP-2 ViT-L (fine-tuned) | 91.8 | BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | |