HyperAI초신경
Zero-Shot Video Retrieval on YouCook2
Evaluation metrics
text-to-video Median Rank
text-to-video R@1
text-to-video R@10
text-to-video R@5
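The metrics above can be illustrated with a minimal sketch. It assumes a square text-to-video similarity matrix in which the correct video for query `i` sits at column `i`, and it breaks ties optimistically (only strictly higher scores push the correct video down). The function name `retrieval_metrics` and these conventions are illustrative assumptions; the papers on this leaderboard may follow different evaluation protocols.

```python
import numpy as np


def retrieval_metrics(sim, ks=(1, 5, 10)):
    """Compute R@K (in percent) and median rank for text-to-video retrieval.

    Assumption: sim[i, j] is the similarity between text query i and video j,
    and the ground-truth video for query i is video i (the diagonal).
    """
    sim = np.asarray(sim, dtype=float)
    # Score of the correct video for each query, shaped (N, 1) for broadcasting.
    correct = np.diag(sim)[:, None]
    # Rank of the correct video = 1 + number of videos scored strictly higher.
    ranks = 1 + (sim > correct).sum(axis=1)
    metrics = {f"R@{k}": 100.0 * float(np.mean(ranks <= k)) for k in ks}
    metrics["MedR"] = float(np.median(ranks))
    return metrics


# Toy example with three queries and three videos (made-up scores).
toy_sim = [
    [0.9, 0.1, 0.2],
    [0.8, 0.3, 0.5],
    [0.1, 0.2, 0.7],
]
toy_metrics = retrieval_metrics(toy_sim)
```

In the toy matrix the correct videos are ranked 1, 3, and 1, so R@1 is two out of three queries and the median rank is 1. Higher R@K is better, while a lower median rank is better.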
Evaluation results
Performance of each model on this benchmark
| Model | text-to-video Median Rank | text-to-video R@1 | text-to-video R@10 | text-to-video R@5 | Paper Title | Repository |
|---|---|---|---|---|---|---|
| VAST, HowToCaption-finetuned | 8 | 19.7 | 53.9 | 43.6 | HowToCaption: Prompting LLMs to Transform Video Annotations at Scale | |
| VideoCoCa | - | 20.3 | 53.3 | 43.0 | VideoCoCa: Video-Text Modeling with Zero-Shot Transfer from Contrastive Captioners | - |
| TACo | - | 19.9 | 55.7 | 43.2 | TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment | - |
| OmniVec2 | - | 26.1 | 70.8 | 54.1 | OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning | - |
| VideoCLIP | - | 22.7 | 63.1 | 50.4 | VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding | |
| HowToCaption | 15 | 13.4 | 44.1 | 33.1 | HowToCaption: Prompting LLMs to Transform Video Annotations at Scale | |
| VATT-MBS | - | - | 45.5 | - | VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text | |
| MIL-NCE | - | 15.1 | 51.2 | 38.0 | End-to-End Learning of Visual Representations from Uncurated Instructional Videos | |
| Norton | - | 24.2 | 64.1 | 51.9 | Multi-granularity Correspondence Learning from Long-term Noisy Videos | |