HyperAI超神経

Cross-Modal Retrieval on RSICD

Evaluation Metrics

Image-to-text R@1
Mean Recall
Text-to-image R@1
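
The page lists these metrics without defining them. The sketch below shows one way they might be computed from an image-caption similarity matrix. It rests on assumptions not stated here: that R@K is the percentage of queries with at least one correct match among the top K results, and that Mean Recall averages R@1, R@5, and R@10 over both retrieval directions, as is common in retrieval work. The names `recall_at_k`, `mean_recall`, and `text_to_image` are hypothetical and are not taken from any of the listed models' code.

```python
import numpy as np


def recall_at_k(similarity, relevant, k):
    """Percentage of queries with at least one correct item in the top k.

    similarity: array of shape (n_queries, n_candidates), higher is better.
    relevant:   one set of correct candidate indices per query.
    """
    # Indices of the k highest-scoring candidates for every query.
    topk = np.argsort(-similarity, axis=1)[:, :k]
    hits = [bool(set(row.tolist()) & rel) for row, rel in zip(topk, relevant)]
    return 100.0 * float(np.mean(hits))


def mean_recall(sim, text_to_image, ks=(1, 5, 10)):
    """Average of R@k over both directions (assumed definition).

    sim[i, j]:        score between image i and caption j.
    text_to_image[j]: index of the image that caption j describes.
    """
    n_images = sim.shape[0]
    # An image may have several correct captions; a caption has one image.
    image_gt = [set() for _ in range(n_images)]
    for caption, image in enumerate(text_to_image):
        image_gt[image].add(caption)
    text_gt = [{image} for image in text_to_image]

    scores = [recall_at_k(sim, image_gt, k) for k in ks]      # image-to-text
    scores += [recall_at_k(sim.T, text_gt, k) for k in ks]    # text-to-image
    return sum(scores) / len(scores)
```

Under these assumptions, Mean Recall can sit well above either R@1 value because it also includes the more lenient R@5 and R@10 scores.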

Evaluation Results

Performance of each model on this benchmark

Comparison Table

| Model | Image-to-text R@1 | Mean Recall | Text-to-image R@1 |
| --- | --- | --- | --- |
| exploring-a-fine-grained-multiscale-method | 5.21% | 15.53% | 4.08% |
| parameter-efficient-transfer-learning-for-1 | 14.13% | 31.12% | 11.63% |
| efficient-remote-sensing-with-harmonized | 20.52% | 38.95% | 15.84% |
| global-local-information-soft-alignment-for | 20.68% | 37.69% | 14.73% |
| a-prior-instruction-representation-framework | 9.88% | 24.46% | 6.97% |
| direction-oriented-visual-semantic-embedding | 8.66% | 22.72% | 6.04% |
| reducing-semantic-confusion-scene-aware | 7.41% | 20.61% | 5.56% |
| rs5m-a-large-scale-vision-language-dataset | 21.13% | 38.87% | 15.59% |
| remoteclip-a-vision-language-foundation-model | 18.39% | 36.35% | 14.73% |
| remote-sensing-cross-modal-text-image | 6.59% | 18.96% | 4.69% |