Image-to-Text Retrieval on WHOOPS!
Metrics
Specificity
Results
Performance of different models on this benchmark
Model Name | Specificity | Paper Title | Repository |
---|---|---|---|
BLIP2 FlanT5-XXL (Text-only FT) | 94 | Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images | - |
BLIP2 FlanT5-XL (Fine-tuned) | 81 | Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images | - |
CoCa ViT-L-14 MSCOCO | 72 | Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images | - |
BLIP2 FlanT5-XXL (Zero-shot) | 71 | Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images | - |
BLIP2 FlanT5-XXL (Fine-tuned) | 84 | Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images | - |
BLIP Large | 77 | Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images | - |
CLIP ViT-L/14 | 70 | Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images | - |