HyperAI

Video Retrieval on MSVD

Evaluation Metrics

text-to-video R@1
video-to-text R@1
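
The page does not define R@1. Under the usual reading, Recall@1 is the percentage of queries whose correct match is ranked first among all candidates. The following is a minimal sketch of that computation, not the benchmark's official evaluation code. It assumes a square similarity matrix in which candidate i is the single correct match for query i; the names `recall_at_k`, `transpose`, and `text_video_sim` are hypothetical, and the similarity values are made up for illustration.

```python
def recall_at_k(sim, k=1):
    """Percentage of queries whose ground-truth candidate ranks in the top k.

    sim[i][j] is the similarity of query i to candidate j. Assumption:
    candidate i is the only correct match for query i.
    """
    hits = 0
    for i, row in enumerate(sim):
        # Rank = number of candidates scored strictly higher than the true match.
        rank = sum(1 for score in row if score > row[i])
        if rank < k:
            hits += 1
    return 100.0 * hits / len(sim)


def transpose(sim):
    """Swap queries and candidates (text-by-video becomes video-by-text)."""
    return [list(col) for col in zip(*sim)]


# Toy text-by-video similarity matrix (illustrative values only).
text_video_sim = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.1, 0.2, 0.7],
]

t2v_r1 = recall_at_k(text_video_sim, k=1)             # text-to-video R@1
v2t_r1 = recall_at_k(transpose(text_video_sim), k=1)  # video-to-text R@1
```

Real evaluations may differ from this sketch, for example when a video has several reference captions, in which case a query can have more than one correct match.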

Evaluation Results

Performance results of each model on this benchmark

Comparison Table
Model name                                      text-to-video R@1   video-to-text R@1
internvideo2-scaling-video-foundation-models    61.4                85.2
internvideo-general-video-foundation-models     58.4                76.3
vlab-enhancing-video-language-pre-training-by   57.5                -
improving-video-text-retrieval-by-multi         51.8                69.3
diffusionret-generative-text-video-retrieval    47.9                60.3
x-clip-end-to-end-multi-grained-contrastive     50.4                66.8
mdmmt-2-multidomain-multimodal-transformer      56.8                -
prototype-based-aleatoric-uncertainty-1         47.3                68.9
clip4clip-an-empirical-study-of-clip-for-end    46.2                62.0
diffusionret-generative-text-video-retrieval    46.6                61.9
a-straightforward-framework-for-video           37                  59.9
cap4video-what-can-auxiliary-captions-do-for    51.8                70.0
hunyuan-tvr-for-text-video-retrivial            59.0                73.0
use-what-you-have-video-retrieval-using         19.8                -
vid-tldr-training-free-token-merging-for        57.9                82.7
lightweight-attentional-feature-fusion-for      45.4                -
cross-modal-retrieval-with-querybank            48.0                -
x-pool-cross-modal-language-video-attention     47.2                66.4
centerclip-token-clustering-for-efficient       50.6                68.4
dual-modal-attention-enhanced-text-video        48.7                -
hunyuan-tvr-for-text-video-retrivial            58.2                69.1
side4video-spatial-temporal-side-network-for    56.1                -
frozen-in-time-a-joint-video-and-image          33.7                -
noise-estimation-using-density-estimation-for   20.3                -