HyperAI

Video Retrieval on YouCook2

Metrics

text-to-video Median Rank
text-to-video R@1
text-to-video R@10
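
For readers unfamiliar with these metrics, the following is a minimal sketch of how Recall@K (R@K) and Median Rank are commonly computed from a text-to-video similarity matrix. The function name `retrieval_metrics`, the toy scores, and the convention that text query i matches video i are assumptions made for illustration. They do not come from this benchmark page or from any listed model's code.

```python
from statistics import median


def retrieval_metrics(similarity, ks=(1, 10)):
    """Compute text-to-video Recall@K (in percent) and Median Rank.

    Assumes similarity[i][j] scores text query i against video j, and that
    the correct video for query i sits at index i (an assumed convention).
    """
    ranks = []
    for i, scores in enumerate(similarity):
        target = scores[i]
        # Rank 1 means no other video scores strictly higher than the
        # correct one; ties are therefore resolved in the query's favor.
        rank = 1 + sum(1 for s in scores if s > target)
        ranks.append(rank)

    # R@K: percentage of queries whose correct video is ranked in the top K.
    metrics = {
        f"R@{k}": 100.0 * sum(r <= k for r in ranks) / len(ranks) for k in ks
    }
    # Median Rank: median position of the correct video (lower is better).
    metrics["Median Rank"] = median(ranks)
    return metrics


# Toy 3x3 example with made-up scores (rows: text queries, columns: videos).
toy_similarity = [
    [0.9, 0.1, 0.3],
    [0.8, 0.2, 0.5],
    [0.2, 0.4, 0.7],
]
toy_metrics = retrieval_metrics(toy_similarity, ks=(1, 2))
```

In this toy case the correct video is ranked first for two of the three queries and third for the remaining one. Higher R@K is better, while a lower Median Rank is better.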

Results

Performance results of various models on this benchmark

Comparison table
| Model name | text-to-video Median Rank | text-to-video R@1 | text-to-video R@10 |
| --- | --- | --- | --- |
| coot-cooperative-hierarchical-transformer-for | 9 | 16.7 | 52.3 |
| howto100m-learning-a-text-video-embedding-by | 24 | 8.2 | 35.3 |
| associating-neural-word-embeddings-with-deep | 75 | 4.6 | 21.6 |
| semantic-role-aware-correlation-transformer | 77 | 5.3 | 20.8 |
| rome-role-aware-mixture-of-expert-transformer | 53 | 6.3 | 25.2 |
| videoclip-contrastive-pre-training-for-zero | - | 32.2 | 75.0 |
| vlm-task-agnostic-video-language-model-pre | 4 | 27.05 | 69.38 |
| taco-token-aware-cascade-contrastive-learning | 4 | 29.6 | 72.7 |
| vast-a-vision-audio-subtitle-text-omni-1 | - | 50.4 | 80.8 |
| omnivec-learning-robust-representations-with | - | - | 64.2 |
| meltr-meta-loss-transformer-for-learning-to | 3 | 33.7 | 74.8 |
| omnivec-learning-robust-representations-with | - | - | 70.8 |
| univilm-a-unified-video-and-language-pre | 4 | 28.9 | 70.0 |
| videoclip-contrastive-pre-training-for-zero | - | 22.7 | 63.1 |
| video-text-modeling-with-zero-shot-transfer | - | 21.7 | 55.2 |
| mdmmt-2-multidomain-multimodal-transformer | 3.0 | 32.0 | 74.8 |