HyperAI
Cross Modal Retrieval On Recipe1M
Metrics
Image-to-text R@1
Text-to-image R@1
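Both metrics are Recall@1: the percentage of queries whose correct counterpart is ranked first among all candidates. The page does not spell out the evaluation protocol, so the following is only a minimal sketch under stated assumptions: a square similarity matrix in which row i is an image, column j is a recipe, and the diagonal entry marks the single correct pair. The function names and the toy numbers are hypothetical and not taken from any of the listed papers.

```python
from typing import List, Sequence


def recall_at_k(sim: Sequence[Sequence[float]], k: int = 1) -> float:
    """Percentage of queries whose true match is among the top-k candidates.

    Assumption: sim[i][j] is the similarity of query i to candidate j,
    and candidate i is the only correct match for query i.
    """
    hits = 0
    for i, row in enumerate(sim):
        # Rank of the true match = number of candidates scoring strictly higher.
        # Ties are therefore resolved in favour of the true match.
        rank = sum(1 for score in row if score > row[i])
        if rank < k:
            hits += 1
    return 100.0 * hits / len(sim)


def transpose(sim: Sequence[Sequence[float]]) -> List[List[float]]:
    """Swap query and candidate roles (images <-> recipes)."""
    return [list(col) for col in zip(*sim)]


# Toy image-by-recipe similarity matrix (made-up numbers, not benchmark data).
sim = [
    [0.9, 0.1, 0.3],
    [0.2, 0.8, 0.4],
    [0.7, 0.6, 0.5],
]

image_to_text = recall_at_k(sim, k=1)             # images query recipes
text_to_image = recall_at_k(transpose(sim), k=1)  # recipes query images
print(f"Image-to-text R@1: {image_to_text:.1f}")
print(f"Text-to-image R@1: {text_to_image:.1f}")
```

The two directions can differ because ranking along rows and ranking along columns of the same matrix are separate retrieval problems, which is why the leaderboard reports them as separate columns.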
Results
Performance results of various models on this benchmark
| Model Name | Image-to-text R@1 | Text-to-image R@1 | Paper Title | Repository |
|---|---|---|---|---|
| VLPCook | 73.6 | 74.7 | Vision and Structured-Language Pretraining for Cross-Modal Food Retrieval | |
| X-MRS | 64 | 63.9 | Cross-Modal Retrieval and Synthesis (X-MRS): Closing the Modality Gap in Shared Representation Learning | |
| H-T | 60.0 | 60.3 | Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning | |
| AdaMine | 39.8 | 40.2 | Cross-Modal Retrieval in the Cooking Context: Learning Semantic Text-Image Embeddings | |
| SCAN | 54.0 | 54.9 | Cross-Modal Food Retrieval: Learning a Joint Embedding of Food Images and Recipes with Semantic Consistency and Attention Mechanism | - |
| ACME | 51.8 | 52.8 | Learning Cross-Modal Embeddings with Adversarial Networks for Cooking Recipes and Food Images | |
| T-Food (CLIP) | 72.3 | 72.6 | Transformer Decoders with MultiModal Regularization for Cross-Modal Food Retrieval | - |
| T-Food | 68.2 | 68.3 | Transformer Decoders with MultiModal Regularization for Cross-Modal Food Retrieval | - |
| VLPCook (R1M+) | 74.9 | 75.6 | Vision and Structured-Language Pretraining for Cross-Modal Food Retrieval | |