HyperAI초신경

Cross-Modal Retrieval on Recipe1M

Evaluation Metrics

Image-to-text R@1

Text-to-image R@1
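
As a rough illustration of what the two R@1 columns measure, the sketch below computes Recall@K from a similarity matrix in which row i is an image, column j is a recipe, and the correct match for image i is assumed to be recipe i. The function names and the toy scores are hypothetical, and this page does not state the test-set size or sampling protocol behind the reported numbers, so treat this as a minimal sketch rather than the benchmark's evaluation code.

```python
def recall_at_k(similarity, k=1):
    """Percentage of queries whose true match ranks in the top k.

    similarity[i][j] is the score between query i and candidate j.
    Assumption: the correct candidate for query i is candidate i.
    """
    if not similarity:
        raise ValueError("similarity matrix is empty")
    hits = 0
    for i, row in enumerate(similarity):
        # Zero-based rank of the true match: how many candidates score strictly higher.
        rank = sum(1 for score in row if score > row[i])
        if rank < k:
            hits += 1
    return 100.0 * hits / len(similarity)


def transpose(matrix):
    """Swap queries and candidates (images <-> recipes)."""
    return [list(column) for column in zip(*matrix)]


# Toy image-by-recipe similarity scores (hypothetical values).
sim = [
    [0.9, 0.1, 0.3],
    [0.5, 0.4, 0.3],
    [0.1, 0.2, 0.7],
]

image_to_text_r1 = recall_at_k(sim, k=1)             # images query recipes
text_to_image_r1 = recall_at_k(transpose(sim), k=1)  # recipes query images
print(f"Image-to-text R@1: {image_to_text_r1:.1f}")
print(f"Text-to-image R@1: {text_to_image_r1:.1f}")
```

Transposing the matrix swaps the roles of query and candidate, which is why the two directions can yield different scores on the same set of pairs.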

Evaluation Results

Performance of each model on this benchmark

Model	Image-to-text R@1	Text-to-image R@1	Paper Title	Repository
VLPCook	73.6	74.7	Vision and Structured-Language Pretraining for Cross-Modal Food Retrieval
X-MRS	64.0	63.9	Cross-Modal Retrieval and Synthesis (X-MRS): Closing the Modality Gap in Shared Representation Learning	-
H-T	60.0	60.3	Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning
AdaMine	39.8	40.2	Cross-Modal Retrieval in the Cooking Context: Learning Semantic Text-Image Embeddings
SCAN	54.0	54.9	Cross-Modal Food Retrieval: Learning a Joint Embedding of Food Images and Recipes with Semantic Consistency and Attention Mechanism	-
ACME	51.8	52.8	Learning Cross-Modal Embeddings with Adversarial Networks for Cooking Recipes and Food Images
T-Food (CLIP)	72.3	72.6	Transformer Decoders with MultiModal Regularization for Cross-Modal Food Retrieval
T-Food	68.2	68.3	Transformer Decoders with MultiModal Regularization for Cross-Modal Food Retrieval
VLPCook (R1M+)	74.9	75.6	Vision and Structured-Language Pretraining for Cross-Modal Food Retrieval
