HyperAI超神経

Video Retrieval On Condensed Movies

Metrics

text-to-video R@1
text-to-video R@10
text-to-video R@5
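
As a rough illustration of what these metrics measure: text-to-video R@K is commonly computed as the percentage of text queries whose ground-truth video appears among the K highest-scoring videos. The sketch below assumes a square similarity matrix in which query i matches video i; the function name `recall_at_k`, the variable `toy_scores`, and the scores themselves are made up for illustration and are not taken from the benchmark or from any listed model's code.

```python
def recall_at_k(similarity, k):
    """Percentage of text queries whose matching video ranks in the top k.

    similarity[i][j] is the score between text query i and video j.
    Assumption: the ground-truth video for query i is video i.
    """
    hits = 0
    for i, row in enumerate(similarity):
        # Zero-based rank of the correct video: how many videos score strictly higher.
        rank = sum(1 for score in row if score > row[i])
        if rank < k:
            hits += 1
    return 100.0 * hits / len(similarity)


# Toy 3x3 example with made-up scores (not benchmark data).
toy_scores = [
    [0.9, 0.1, 0.3],  # query 0: correct video ranked 1st
    [0.8, 0.5, 0.6],  # query 1: correct video ranked 3rd
    [0.2, 0.7, 0.4],  # query 2: correct video ranked 2nd
]

for k in (1, 2, 3):
    print(f"R@{k}: {recall_at_k(toy_scores, k):.1f}")
```

In this toy case only one of the three queries ranks its video first, so R@1 is about 33.3, while every query finds its video within the top 3. Ties are resolved in favor of the correct video here, which is a simplification; real evaluation scripts may handle ties differently.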

Results

Performance of each model on this benchmark

| Model name | text-to-video R@1 | text-to-video R@10 | text-to-video R@5 | Paper Title |
|---|---|---|---|---|
| VINDLU | 18.4 | 44.3 | 36.4 | VindLU: A Recipe for Effective Video-and-Language Pretraining |
| LF-VILA | 13.6 | 41.8 | 32.5 | Long-Form Video-Language Pre-Training with Multimodal Temporal Contrastive Learning |
| TESTA (ViT-B/16) | 24.9 | 55.1 | 46.5 | TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding |