HyperAI초신경

Cross-Modal Retrieval on RSITMD

Evaluation Metrics

Image-to-text R@1

Mean Recall

Text-to-image R@1
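The page does not define these metrics, so the following is a minimal sketch under stated assumptions. R@K is taken to be the percentage of queries with at least one correct match among the top K retrieved items. Mean Recall is assumed to follow the common convention in the RSITMD retrieval literature: the average of R@1, R@5 and R@10 over both retrieval directions. The function names and the similarity-matrix input format are hypothetical and are not taken from any of the listed papers.

```python
import numpy as np


def recall_at_k(sim, relevant, k):
    """Percentage of queries with at least one relevant item in the top k.

    sim      : (n_queries, n_candidates) similarity scores, higher is better.
    relevant : one set of relevant candidate indices per query.
    """
    sim = np.asarray(sim, dtype=float)
    # Indices of the k highest-scoring candidates for each query (row).
    topk = np.argsort(-sim, axis=1)[:, :k]
    hits = [bool(set(row.tolist()) & set(rel)) for row, rel in zip(topk, relevant)]
    return 100.0 * float(np.mean(hits))


def mean_recall(sim_i2t, rel_i2t, sim_t2i, rel_t2i, ks=(1, 5, 10)):
    """Average of R@k over both directions (assumed definition of Mean Recall)."""
    scores = [recall_at_k(sim_i2t, rel_i2t, k) for k in ks]
    scores += [recall_at_k(sim_t2i, rel_t2i, k) for k in ks]
    return float(np.mean(scores))


# Toy data: 2 images, 4 captions (captions 0-1 describe image 0, 2-3 image 1).
sim_i2t = np.array([[0.9, 0.1, 0.3, 0.2],
                    [0.8, 0.2, 0.7, 0.6]])
rel_i2t = [{0, 1}, {2, 3}]
sim_t2i = sim_i2t.T
rel_t2i = [{0}, {0}, {1}, {1}]

print(recall_at_k(sim_i2t, rel_i2t, 1))  # image-to-text R@1
print(recall_at_k(sim_t2i, rel_t2i, 1))  # text-to-image R@1
print(mean_recall(sim_i2t, rel_i2t, sim_t2i, rel_t2i, ks=(1, 2)))
```

The toy example uses ks=(1, 2) only because it has two images; a real evaluation under the assumed convention would keep the default ks=(1, 5, 10).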

Evaluation Results

Performance of each model on this benchmark

Model Name	Image-to-text R@1	Mean Recall	Text-to-image R@1	Paper Title	Repository
GeoRSCLIP-FT	32.30%	51.81%	25.04%	RS5M and GeoRSCLIP: A Large Scale Vision-Language Dataset and A Large Vision-Language Model for Remote Sensing
HarMA (w/ GeoRSCLIP)	32.74%	52.27%	25.62%	Efficient Remote Sensing with Harmonized Transfer Learning and Modality Alignment
GaLR	14.82%	31.41%	11.15%	Remote Sensing Cross-Modal Text-Image Retrieval Based on Global and Local Information
AMFMN	10.63%	29.72%	11.51%	Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval
PIR	18.14%	38.24%	12.17%	A Prior Instruction Representation Framework for Remote Sensing Image-text Retrieval	-
PE-RSITR (MRS-Adapter)	23.67%	44.47%	20.10%	Parameter-Efficient Transfer Learning for Remote Sensing Image-Text Retrieval
GLISA	32.08%	50.69%	23.36%	Global–Local Information Soft-Alignment for Cross-Modal Remote-Sensing Image–Text Retrieval	-
SWAN	13.35%	34.11%	11.24%	Reducing Semantic Confusion: Scene-aware Aggregation Network for Remote Sensing Cross-modal Retrieval	-
RemoteCLIP	28.76%	50.52%	23.76%	RemoteCLIP: A Vision Language Foundation Model for Remote Sensing
DOVE	16.81%	37.73%	12.20%	Direction-Oriented Visual-semantic Embedding Model for Remote Sensing Image-text Retrieval	-
