Phrase Grounding On Flickr30K Entities Test
Metrics
R@1
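As a rough illustration of what R@1 measures here, the sketch below assumes the protocol commonly used for phrase grounding: a phrase counts as correctly grounded when the model's top-ranked box overlaps the ground-truth box with an IoU of at least 0.5. The helper names (`iou`, `recall_at_1`), the box format, and the default threshold are assumptions made for this example. They are not taken from this page, and individual papers on the leaderboard may handle details such as phrases with multiple ground-truth boxes differently.

```python
from typing import Sequence, Tuple

# Hypothetical box format for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_1(
    top1_boxes: Sequence[Box],
    gt_boxes: Sequence[Box],
    iou_threshold: float = 0.5,  # assumed threshold, not stated on this page
) -> float:
    """Percentage of phrases whose top-ranked box matches the ground truth."""
    if len(top1_boxes) != len(gt_boxes):
        raise ValueError("expected one predicted box per ground-truth box")
    if not gt_boxes:
        return 0.0
    hits = sum(
        iou(pred, gt) >= iou_threshold
        for pred, gt in zip(top1_boxes, gt_boxes)
    )
    return 100.0 * hits / len(gt_boxes)


if __name__ == "__main__":
    predictions = [(0, 0, 10, 10), (0, 0, 10, 10)]
    ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
    # First phrase is an exact match, second has no overlap.
    print(recall_at_1(predictions, ground_truth))
```

Reporting the score as a percentage matches the scale of the values in the results table.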
Results
Performance results of various models on this benchmark
| Model Name | R@1 | Paper Title |
| --- | --- | --- |
| GLIPv2 | 87.7 | GLIPv2: Unifying Localization and Vision-Language Understanding |
| FIBER-B | 87.4 | Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone |
| GLIP | 87.1 | Grounded Language-Image Pre-training |
| PEVL | 84.4 | PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models |
| MDETR-ENB5 | 84.3 | MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding |
| DIGN | 78.73 | Disentangled Motif-aware Graph Learning for Phrase Grounding |
| LCMCG | 76.74 | Learning Cross-modal Context Graph for Visual Grounding |
| Soft-Label Chain CRF (SL-CCRF) | 74.69 | Phrase Grounding by Soft-Label Chain Conditional Random Field |
| DDPN (ResNet-101) | 73.3 | Rethinking Diversified and Discriminative Proposal Generation for Visual Grounding |
| VisualBERT | 71.33 | VisualBERT: A Simple and Performant Baseline for Vision and Language |
| BAN (Bottom-Up detector) | 69.69 | Bilinear Attention Networks |
| MCB | 48.69 | Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding |
| GroundeR 100.0% annot. | 48.38 | Grounding of Textual Phrases in Images by Reconstruction |
| DSPE | 43.89 | Learning Deep Structure-Preserving Image-Text Embeddings |
| CCA - Fast RCNN | 41.77 | Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models |
| CCA - VGG19 | 30.83 | Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models |
| SCRC | 27.8 | Natural Language Object Retrieval |
| CCA | 25.30 | Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models |