HyperAI超神経

Zero-Shot Video Retrieval on LSMDC

Evaluation metrics

text-to-video R@1
text-to-video R@10
text-to-video R@5
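
The metrics above are recall at rank K (R@K) for text-to-video retrieval: the percentage of text queries whose ground-truth video appears among the top K retrieved videos. The following is a minimal sketch of how such a score can be computed from a similarity matrix. It assumes one ground-truth video per query, located at the same index as the query; the function name `recall_at_k` and the toy matrix `demo_sim` are illustrative and do not come from this benchmark or from any listed model.

```python
import numpy as np


def recall_at_k(sim, k):
    """Return text-to-video R@K as a percentage.

    Assumption: sim[i, j] is the similarity between text query i and
    video j, and the correct video for query i is video i.
    """
    sim = np.asarray(sim, dtype=float)
    # Similarity of each query to its own ground-truth video.
    correct = np.diag(sim)
    # Rank of the ground-truth video: how many videos score strictly higher.
    ranks = (sim > correct[:, None]).sum(axis=1)
    # A query counts as a hit when its ground-truth video is in the top k.
    return 100.0 * float((ranks < k).mean())


# Toy similarity matrix (3 text queries x 3 videos), made up for illustration.
demo_sim = np.array([
    [0.9, 0.1, 0.2],
    [0.8, 0.3, 0.5],
    [0.1, 0.7, 0.6],
])

for k in (1, 2, 3):
    print(f"R@{k}: {recall_at_k(demo_sim, k):.1f}")
```

Because a video ranked in the top 5 is also in the top 10, R@1 ≤ R@5 ≤ R@10 holds for any single model, which is a quick sanity check when reading the scores.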

Results

Performance results of each model on this benchmark

Comparison table
Model name                                     text-to-video R@1  text-to-video R@10  text-to-video R@5
one-for-all-video-conversation-is-feasible     19.5               45.0                35.9
howtocaption-prompting-llms-to-transform       17.3               38.6                31.7
hitea-hierarchical-temporal-aware-video        18.3               44.2                36.7
bridgeformer-bridging-video-text-retrieval     12.2               32.2                25.9
noise-estimation-using-density-estimation-for  4.2                17.1                11.6
internvideo2-scaling-video-foundation-models   33.8               62.2                55.9
clip4clip-an-empirical-study-of-clip-for-end   15.1               36.4                28.5
unmasked-teacher-towards-training-efficient    25.2               50.5                43.0
hitea-hierarchical-temporal-aware-video        15.5               39.8                31.1
mplug-2-a-modularized-multi-modal-foundation   24.1               52.0                43.8
miles-visual-bert-pre-training-with-injected   11.1               30.6                24.7
internvideo2-scaling-video-foundation-models   32.0               59.4                52.4
internvideo-general-video-foundation-models    17.6               40.2                32.4
clover-towards-a-unified-video-language        14.7               38.2                29.2
seeing-what-you-miss-vision-language-pre       17.2               39.1                32.4
howtocaption-prompting-llms-to-transform       27.7               54.6                46.5