HyperAI

Highlight Detection on QVHighlights

Metrics

Hit@1
mAP
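
The page lists these two metrics without defining them. As a rough sketch only, assuming the common reading for highlight detection (Hit@1 checks whether the top-scored clip is a ground-truth highlight; mAP is the mean over queries of the average precision of the clip ranking), the per-query computation might look like the following. The function names and the binary `gt_is_highlight` labels are illustrative assumptions; the benchmark's official evaluation script may binarize saliency annotations differently.

```python
from typing import Sequence


def hit_at_1(pred_scores: Sequence[float], gt_is_highlight: Sequence[bool]) -> float:
    """Return 1.0 if the highest-scored clip is a ground-truth highlight, else 0.0."""
    top = max(range(len(pred_scores)), key=lambda i: pred_scores[i])
    return 1.0 if gt_is_highlight[top] else 0.0


def average_precision(pred_scores: Sequence[float], gt_is_highlight: Sequence[bool]) -> float:
    """Average precision of the clip ranking induced by pred_scores (one query)."""
    # Rank clip indices from highest to lowest predicted score.
    order = sorted(range(len(pred_scores)), key=lambda i: pred_scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if gt_is_highlight[idx]:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / hits if hits else 0.0


def mean(values: Sequence[float]) -> float:
    """Average per-query values to get a dataset-level score."""
    return sum(values) / len(values) if values else 0.0
```

Averaging `hit_at_1` and `average_precision` over all queries with `mean`, then multiplying by 100, would give figures on the same percentage scale as the table below, under the assumptions stated above.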

Results

Performance results of various models on this benchmark

| Model Name | Hit@1 | mAP | Paper Title | Repository |
|---|---|---|---|---|
| SG-DETR | 69.13 | 43.76 | Saliency-Guided DETR for Moment Retrieval and Highlight Detection | - |
| SG-DETR (w/ PT) | 71.00 | 44.70 | Saliency-Guided DETR for Moment Retrieval and Highlight Detection | - |
| VideoLights-B-pt | 70.56 | 42.84 | VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval | - |
| HL-CLIP | 70.60 | 41.94 | Unleash the Potential of CLIP for Video Highlight Detection | - |
| UniVTG (w/ PT) | 66.28 | 40.54 | UniVTG: Towards Unified Video-Language Temporal Grounding | - |
| LLMEPET | 65.69 | 40.33 | Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval | - |
| Moment-DETR w/ PT | 60.17 | 37.43 | QVHighlights: Detecting Moments and Highlights in Videos via Natural Language Queries | - |
| QD-DETR (only Video w/ PT) | 61.91 | - | Query-Dependent Video Representation for Moment Retrieval and Highlight Detection | - |
| CG-DETR (w/ PT) | 66.60 | 40.71 | Correlation-Guided Query-Dependency Calibration for Video Temporal Grounding | - |
| QD-DETR | 62.87 | 39.04 | Query-Dependent Video Representation for Moment Retrieval and Highlight Detection | - |
| QD-DETR (w/ PT) | 62.27 | 38.52 | Query-Dependent Video Representation for Moment Retrieval and Highlight Detection | - |
| NumPro | 70.71 | 40.54 | Number it: Temporal Grounding Videos like Flipping Manga | - |
| QD-DETR (only Video) | 62.40 | 38.94 | Query-Dependent Video Representation for Moment Retrieval and Highlight Detection | - |
| UMT (w. PT) | - | 39.12 | UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection | - |
| UMT | - | 38.18 | UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection | - |
| R^2-Tuning | 64.20 | 40.75 | $R^2$-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding | - |
| CG-DETR | 66.21 | 40.33 | Correlation-Guided Query-Dependency Calibration for Video Temporal Grounding | - |
| FlashVTG | 71.01 | 44.09 | FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding | - |
| UniVTG | 60.96 | 38.20 | UniVTG: Towards Unified Video-Language Temporal Grounding | - |