Zero-Shot Video Retrieval on YouCook2
Evaluation Metrics
text-to-video Median Rank
text-to-video R@1
text-to-video R@5
text-to-video R@10
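For readers unfamiliar with these metrics, the following is a minimal sketch of how text-to-video Recall@K and Median Rank are commonly computed. It assumes a query-by-video similarity matrix in which the ground-truth video for query i is video i. The function names and the toy matrix are hypothetical and are not taken from the evaluation code of any paper on this page. Higher R@K is better; lower Median Rank is better.

```python
from statistics import median


def text_to_video_ranks(similarity):
    """Return the 1-based rank of the correct video for each text query.

    Assumption: similarity[i][j] scores text query i against video j,
    and video i is the single ground-truth match for query i.
    """
    ranks = []
    for i, row in enumerate(similarity):
        target = row[i]
        # Rank = 1 + number of videos scored strictly higher than the match.
        ranks.append(1 + sum(1 for score in row if score > target))
    return ranks


def recall_at_k(ranks, k):
    """Percentage of queries whose correct video appears in the top k."""
    return 100.0 * sum(1 for r in ranks if r <= k) / len(ranks)


def median_rank(ranks):
    """Median position of the correct video across all queries."""
    return median(ranks)


# Toy 4x4 similarity matrix (made-up numbers, not from any model).
toy_similarity = [
    [0.9, 0.1, 0.2, 0.3],  # correct video ranked 1st
    [0.8, 0.5, 0.1, 0.2],  # correct video ranked 2nd
    [0.7, 0.6, 0.4, 0.5],  # correct video ranked 4th
    [0.1, 0.2, 0.3, 0.9],  # correct video ranked 1st
]
toy_ranks = text_to_video_ranks(toy_similarity)

print(toy_ranks)                  # [1, 2, 4, 1]
print(recall_at_k(toy_ranks, 1))  # 50.0
print(median_rank(toy_ranks))     # 1.5
```

In this sketch a tie counts in the query's favor, because only strictly higher scores push the rank down. Real evaluation scripts may break ties differently.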
Results
Performance of each model on this benchmark
| Model | text-to-video Median Rank | text-to-video R@1 | text-to-video R@5 | text-to-video R@10 | Paper Title | Repository |
| --- | --- | --- | --- | --- | --- | --- |
| VAST, HowToCaption-finetuned | 8 | 19.7 | 43.6 | 53.9 | HowToCaption: Prompting LLMs to Transform Video Annotations at Scale | |
| VideoCoCa | - | 20.3 | 43.0 | 53.3 | VideoCoCa: Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners | - |
| TACo | - | 19.9 | 43.2 | 55.7 | TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment | - |
| OmniVec2 | - | 26.1 | 54.1 | 70.8 | OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning | - |
| VideoCLIP | - | 22.7 | 50.4 | 63.1 | VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding | |
| HowToCaption | 15 | 13.4 | 33.1 | 44.1 | HowToCaption: Prompting LLMs to Transform Video Annotations at Scale | |
| VATT-MBS | - | - | - | 45.5 | VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text | |
| MIL-NCE | - | 15.1 | 38.0 | 51.2 | End-to-End Learning of Visual Representations from Uncurated Instructional Videos | |
| Norton | - | 24.2 | 51.9 | 64.1 | Multi-granularity Correspondence Learning from Long-term Noisy Videos | |