HyperAI
Video Retrieval on VATEX
Metrics
text-to-video R@1
text-to-video R@10
text-to-video R@5
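
For readers unfamiliar with the metric names above, R@K (recall at K) is commonly read as the percentage of text queries whose correct video appears among the top K retrieved videos. The sketch below illustrates that reading on made-up scores. It assumes one correct video per query, with query i matching video i; the function name `recall_at_k` and the toy matrix are hypothetical, and each paper on this benchmark may use its own evaluation protocol.

```python
def recall_at_k(similarity, k):
    """Percentage of text queries whose correct video ranks in the top k.

    similarity[i][j] is the score between text query i and video j.
    Assumption for this sketch: the correct video for query i is video i.
    """
    if not similarity:
        raise ValueError("similarity matrix is empty")
    hits = 0
    for i, scores in enumerate(similarity):
        # Zero-based rank: how many videos score strictly higher
        # than the correct one for this query.
        rank = sum(1 for s in scores if s > scores[i])
        if rank < k:
            hits += 1
    return 100.0 * hits / len(similarity)


# Toy scores for 3 text queries against 3 videos (illustrative numbers only).
toy = [
    [0.9, 0.1, 0.3],  # query 0: correct video ranked 1st
    [0.8, 0.5, 0.2],  # query 1: correct video ranked 2nd
    [0.4, 0.6, 0.1],  # query 2: correct video ranked 3rd
]
r1 = recall_at_k(toy, 1)
r2 = recall_at_k(toy, 2)
r3 = recall_at_k(toy, 3)
```

On this toy input, one of three queries is a hit at K=1, two at K=2, and all three at K=3, so the value can only grow as K increases.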
Results
Performance results of various models on this benchmark
| Model name | text-to-video R@1 | text-to-video R@10 | text-to-video R@5 | Paper title |
|---|---|---|---|---|
| VAST | 83.0 | 99.2 | 98.2 | VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset |
| QB-Norm+CLIP2Video | 58.8 | 93.8 | - | Cross Modal Retrieval with Querybank Normalisation |
| CLIP2Video | 57.3 | 90 | - | CLIP2Video: Mastering Video-Text Retrieval via Image CLIP |
| Side4Video | 68.8 | 97.0 | 93.5 | Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning |
| VALOR | 78.5 | 98.7 | 97.1 | VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset |
| Cap4Video | 66.6 | 97.0 | 93.1 | Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval? |
| InternVideo2-6B | 75.5 | - | - | InternVideo2: Scaling Foundation Models for Multimodal Video Understanding |
| GRAM | 87.7 | 100 | - | Gramian Multimodal Representation Learning and Alignment |
| TS2-Net | 59.1 | 95.2 | - | TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval |
| LAFF | 59.1 | 91.7 | - | Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval |
| Unmasked Teacher | 72 | 97.8 | 95.1 | Unmasked Teacher: Towards Training-Efficient Video Foundation Models |
| InternVideo | 71.1 | - | - | InternVideo: General Video Foundation Models via Generative and Discriminative Learning |
| TeachCLIP | 63.6 | 96.1 | 91.9 | Holistic Features are almost Sufficient for Text-to-Video Retrieval |