HyperAI초신경
Moment Retrieval on Charades-STA
Evaluation Metrics
R@1 IoU=0.5
R@1 IoU=0.7
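The two metrics above are commonly computed as follows: for each query, the top-ranked predicted moment counts as a hit if its temporal IoU with the ground-truth moment reaches the threshold, and R@1 is the percentage of queries with a hit. The sketch below is a minimal illustration under that assumption, not the evaluation code of any listed paper. The function names (`temporal_iou`, `recall_at_1`), the `(start, end)` span format in seconds, and the sample spans are hypothetical, and individual papers may differ in details such as whether the comparison is `>=` or `>`.

```python
from typing import Sequence, Tuple

# Hypothetical span format: (start_sec, end_sec) with end >= start.
Span = Tuple[float, float]


def temporal_iou(pred: Span, gt: Span) -> float:
    """Intersection over union of two 1-D time intervals."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_1(top1_preds: Sequence[Span], gts: Sequence[Span],
                iou_threshold: float) -> float:
    """Percentage of queries whose top-1 prediction has IoU >= threshold.

    Assumes one ground-truth span and one top-ranked prediction per query.
    """
    if len(top1_preds) != len(gts):
        raise ValueError("predictions and ground truths must align one-to-one")
    if not gts:
        return 0.0
    hits = sum(temporal_iou(p, g) >= iou_threshold
               for p, g in zip(top1_preds, gts))
    return 100.0 * hits / len(gts)


# Made-up spans for illustration only (not Charades-STA data).
preds = [(2.0, 7.0), (0.0, 5.0), (10.0, 20.0)]
gts = [(2.0, 10.0), (4.0, 9.0), (10.0, 19.0)]

print(round(recall_at_1(preds, gts, 0.5), 2))  # 66.67 (IoUs: 0.625, 0.111, 0.9)
print(round(recall_at_1(preds, gts, 0.7), 2))  # 33.33
```

Raising the threshold from 0.5 to 0.7 can only keep or lower the score, which is consistent with every row in the results having a lower R@1 IoU=0.7 than R@1 IoU=0.5.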
Evaluation Results
Performance of each model on this benchmark
| Model | R@1 IoU=0.5 | R@1 IoU=0.7 | Paper Title | Repository |
|---|---|---|---|---|
| video-mamba-suite | 57.18 | 36.05 | Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding | |
| SG-DETR (w/ PT) | 71.10 | 52.80 | Saliency-Guided DETR for Moment Retrieval and Highlight Detection | |
| VideoChat-T (ZS) | 48.7 | 24.0 | TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning | - |
| SimVTP | 44.7 | 26.3 | SimVTP: Simple Video Text Pre-training with Masked Autoencoders | - |
| CG-DETR | 58.44 | 36.34 | Correlation-Guided Query-Dependency Calibration for Video Temporal Grounding | |
| UnLoc-L | 60.8 | 38.4 | UnLoc: A Unified Framework for Video Localization Tasks | |
| UMT (VO) | 49.35 | 26.16 | UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection | |
| LLaVA-MR | 70.65 | 49.58 | LLaVA-MR: Large Language-and-Vision Assistant for Video Moment Retrieval | - |
| SG-DETR | 70.20 | 49.50 | Saliency-Guided DETR for Moment Retrieval and Highlight Detection | |
| LD-DETR | 62.58 | 41.56 | LD-DETR: Loop Decoder DEtection TRansformer for Video Moment Retrieval and Highlight Detection | |
| UVCOM | 59.25 | 36.64 | Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection | |
| Moment-DETR | 53.63 | 31.37 | QVHighlights: Detecting Moments and Highlights in Videos via Natural Language Queries | |
| UMT (VA) | 48.31 | 29.25 | UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection | |
| InternVideo2-6B | 70.03 | 48.95 | InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | |
| VideoLights-B-pt | 61.96 | 41.05 | VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval | |
| InternVideo2-1B | 68.36 | 45.03 | InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | |
| BM-DETR | 59.48 | 38.33 | Background-aware Moment Detection for Video Moment Retrieval | |
| UniMD+Sync. | 63.98 | 44.46 | UniMD: Towards Unifying Moment Retrieval and Temporal Action Detection | |
| Moment-DETR w/ PT (on 10K HowTo100M videos) | 55.65 | 34.17 | QVHighlights: Detecting Moments and Highlights in Videos via Natural Language Queries | |
| UnLoc-B | 58.1 | 35.4 | UnLoc: A Unified Framework for Video Localization Tasks | |