HyperAI

Moment Retrieval on QVHighlights

Metrics

R@1 IoU=0.5
R@1 IoU=0.7
mAP
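
For orientation, the two R@1 metrics can be sketched as follows. This is a minimal illustration, not the benchmark's official evaluation script: it assumes each query has exactly one ground-truth moment given as a (start, end) pair in seconds, and the function names temporal_iou and recall_at_1 are hypothetical. mAP is not covered here.

```python
def temporal_iou(pred, gt):
    """Intersection over union of two (start, end) intervals in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_1(top1_preds, gts, iou_threshold):
    """Percentage of queries whose top-ranked moment reaches the IoU threshold.

    Assumption: one ground-truth moment per query (a simplification).
    """
    hits = sum(
        temporal_iou(p, g) >= iou_threshold for p, g in zip(top1_preds, gts)
    )
    return 100.0 * hits / len(gts)


# Hypothetical top-1 predictions and ground-truth moments for three queries.
preds = [(10.0, 20.0), (0.0, 5.0), (30.0, 50.0)]
gts = [(12.0, 22.0), (0.0, 4.0), (60.0, 70.0)]

r1_at_05 = recall_at_1(preds, gts, 0.5)  # R@1 IoU=0.5
r1_at_07 = recall_at_1(preds, gts, 0.7)  # R@1 IoU=0.7
```

A stricter threshold can only keep or lower the score, which is why every row in the results reports R@1 IoU=0.7 below R@1 IoU=0.5.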

Results

Performance results of various models on this benchmark

| Model name | R@1 IoU=0.5 | R@1 IoU=0.7 | mAP | mAP@0.5 | mAP@0.75 | Paper title | Repository |
|---|---|---|---|---|---|---|---|
| SG-DETR | 72.20 | 56.60 | 54.10 | 73.20 | 55.80 | Saliency-Guided DETR for Moment Retrieval and Highlight Detection | - |
| LLMEPET | 66.73 | 49.94 | 44.05 | 65.76 | 43.91 | Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval | - |
| DenoiseLoc | 59.27 | 45.07 | - | - | - | Boundary-Denoising for Video Activity Localization | - |
| BAM-DETR | 62.71 | 48.64 | 45.36 | 64.57 | 46.33 | BAM-DETR: Boundary-Aligned Moment Detection Transformer for Temporal Sentence Grounding in Videos | - |
| UMT | - | - | 36.12 | - | - | UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection | - |
| VideoLights-B-pt | 70.36 | 55.25 | 47.94 | 69.53 | 49.17 | VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval | - |
| UniVTG (w/ PT) | 65.43 | 50.06 | 43.63 | 64.06 | 45.02 | UniVTG: Towards Unified Video-Language Temporal Grounding | - |
| UVCOM (w/ PT ASR Captions) | 64.53 | 48.31 | 43.8 | 64.78 | 43.65 | Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection | - |
| QD-DETR (only Video) | 62.40 | 44.98 | 39.86 | 62.52 | 39.88 | Query-Dependent Video Representation for Moment Retrieval and Highlight Detection | - |
| R^2-Tuning | 68.03 | 49.35 | 46.17 | 69.04 | 47.56 | $R^2$-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding | - |
| FlashVTG | 70.69 | 53.96 | 52.00 | 72.33 | 53.85 | FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding | - |
| SeViLA-Localizer | 54.5 | 36.5 | 32.3 | - | - | - | - |
| QD-DETR (w/ audio) | 63.06 | 45.10 | 40.19 | 63.04 | 40.10 | Query-Dependent Video Representation for Moment Retrieval and Highlight Detection | - |
| BAM-DETR (w/ audio) | 64.07 | 48.12 | 46.91 | 65.61 | 47.51 | BAM-DETR: Boundary-Aligned Moment Detection Transformer for Temporal Sentence Grounding in Videos | - |
| CG-DETR | 65.43 | 48.38 | 42.86 | 64.51 | 42.77 | Correlation-Guided Query-Dependency Calibration for Video Temporal Grounding | - |
| UnLoc-L | 66.1 | 46.7 | - | - | - | UnLoc: A Unified Framework for Video Localization Tasks | - |
| LD-DETR | 66.80 | 51.04 | 46.41 | 67.61 | 46.99 | LD-DETR: Loop Decoder DEtection TRansformer for Video Moment Retrieval and Highlight Detection | - |
| LA-DETR | 63.94 | 51.10 | 47.93 | 65.65 | 49.44 | Length-Aware DETR for Robust Moment Retrieval | - |
| LLaVA-MR | 76.59 | 61.48 | 52.73 | 69.41 | 54.40 | LLaVA-MR: Large Language-and-Vision Assistant for Video Moment Retrieval | - |
| BM-DETR | 60.12 | 43.05 | 40.08 | 63.08 | 40.18 | Background-aware Moment Detection for Video Moment Retrieval | - |