HyperAI

Visual Grounding on RefCOCO val

Metrics

Accuracy (%)
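
For reference, accuracy on RefCOCO-style visual grounding is commonly computed as the percentage of referring expressions whose predicted box overlaps the ground-truth box with an intersection-over-union (IoU) of at least 0.5. This page does not state the threshold or the box format, so the 0.5 cutoff, the (x1, y1, x2, y2) corner layout, and the function names `iou` and `grounding_accuracy` below are assumptions. It is a minimal sketch, not the evaluation code used by any listed model.

```python
from typing import Sequence, Tuple

# Assumed box layout: (x1, y1, x2, y2) with x2 >= x1 and y2 >= y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def grounding_accuracy(
    preds: Sequence[Box],
    gts: Sequence[Box],
    threshold: float = 0.5,  # assumed cutoff; not stated on this page
) -> float:
    """Percentage of predictions whose IoU with the ground truth meets the threshold."""
    if len(preds) != len(gts):
        raise ValueError("preds and gts must have the same length")
    if not preds:
        return 0.0
    hits = sum(iou(p, g) >= threshold for p, g in zip(preds, gts))
    return 100.0 * hits / len(preds)
```

Called with one predicted box and one ground-truth box per referring expression, `grounding_accuracy` returns a value on the same 0 to 100 scale as the Accuracy (%) column.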

Results

Performance results of various models on this benchmark

Comparison Table
Model Name                                      Accuracy (%)
multi-grained-vision-language-pre-training      84.51
x-2-vlm-all-in-one-pre-trained-model-for        85.2
toward-building-general-foundation-models-for   86.1
mplug-2-a-modularized-multi-modal-foundation    90.33
x-2-vlm-all-in-one-pre-trained-model-for        87.6
florence-2-advancing-a-unified-representation   93.4