HyperAI초신경

Phrase Grounding On Flickr30K Entities Test

Evaluation Metric

R@1
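
The page does not define R@1. As a minimal sketch, assuming the convention commonly used for phrase grounding (a phrase counts as correctly grounded when the model's top-ranked box overlaps the ground-truth box with IoU of at least 0.5), the metric could be computed as below. The function names, the box format `(x_min, y_min, x_max, y_max)`, and the 0.5 threshold are illustrative assumptions, not details taken from this benchmark page.

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_1(
    top1_boxes: Sequence[Box],
    gt_boxes: Sequence[Box],
    iou_threshold: float = 0.5,  # assumed threshold, not stated on the page
) -> float:
    """Percentage of phrases whose top-1 box matches its ground-truth box."""
    if len(top1_boxes) != len(gt_boxes):
        raise ValueError("expected one predicted box per ground-truth box")
    if not gt_boxes:
        return 0.0
    hits = sum(iou(p, g) >= iou_threshold for p, g in zip(top1_boxes, gt_boxes))
    return 100.0 * hits / len(gt_boxes)


if __name__ == "__main__":
    predictions = [(0, 0, 10, 10), (0, 0, 10, 10)]
    ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
    # One of the two phrases is grounded correctly.
    print(recall_at_1(predictions, ground_truth))  # 50.0
```

Under this reading, an R@1 of 87.7 would mean that 87.7% of test phrases had their top-ranked box accepted. Individual papers may differ in how they handle phrases with several ground-truth boxes, so scores are only directly comparable when the same protocol is used.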

Evaluation Results

Performance results of each model on this benchmark

Comparison Table
Model Name                                      R@1
multimodal-compact-bilinear-pooling-for         48.69
disentangled-motif-aware-graph-learning-for     78.73
visualbert-a-simple-and-performant-baseline     71.33
learning-cross-modal-context-graph-for-visual   76.74
phrase-grounding-by-soft-label-chain            74.69
natural-language-object-retrieval               27.8
rethinking-diversified-and-discriminative       73.3
learning-deep-structure-preserving-image-text   43.89
glipv2-unifying-localization-and-vision         87.7
flickr30k-entities-collecting-region-to         25.30
flickr30k-entities-collecting-region-to         41.77
mdetr-modulated-detection-for-end-to-end        84.3
grounded-language-image-pre-training            87.1
pevl-position-enhanced-pre-training-and         84.4
flickr30k-entities-collecting-region-to         30.83
bilinear-attention-networks                     69.69
coarse-to-fine-vision-language-pre-training     87.4
grounding-of-textual-phrases-in-images-by       48.38