VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval

Dhiman Paul, Md Rizwan Parvez, Nabeel Mohammed, Shafin Rahman

Abstract

Video Highlight Detection and Moment Retrieval (HD/MR) are essential in video analysis. Recent joint prediction transformer models often overlook their cross-task dynamics and video-text alignment and refinement. Moreover, most models typically use limited, uni-directional attention mechanisms, resulting in weakly integrated representations and suboptimal performance in capturing the interdependence between video and text modalities. Although large-language and vision-language models (LLM/LVLMs) have gained prominence across various domains, their application in this field remains relatively underexplored. Here we propose VideoLights, a novel HD/MR framework addressing these limitations through (i) Convolutional Projection and Feature Refinement modules with an alignment loss for better video-text feature alignment, (ii) Bi-Directional Cross-Modal Fusion network for strongly coupled query-aware clip representations, and (iii) Uni-directional joint-task feedback mechanism enhancing both tasks through correlation. In addition, (iv) we introduce hard positive/negative losses for adaptive error penalization and improved learning, and (v) leverage LVLMs like BLIP-2 for enhanced multimodal feature integration and intelligent pretraining using synthetic data generated from LVLMs. Comprehensive experiments on QVHighlights, TVSum, and Charades-STA benchmarks demonstrate state-of-the-art performance. Codes and models are available at https://github.com/dpaul06/VideoLights.
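
To make the idea of bi-directional cross-modal fusion in item (ii) more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes plain scaled dot-product attention with no learned projections, and the names `cross_attend` and `bidirectional_fusion`, the ordering of the attention passes, and the toy feature vectors are all hypothetical. The abstract does not specify the actual layer design; the released repository is the authoritative reference.

```python
import math


def softmax(row):
    # Numerically stable softmax over a list of floats.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    # Scaled dot-product attention without learned weights (an assumption):
    # each query vector becomes a softmax-weighted average of keys_values.
    dim = len(queries[0])
    out = []
    for q in queries:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            for k in keys_values
        ]
        weights = softmax(scores)
        out.append([
            sum(w * kv[d] for w, kv in zip(weights, keys_values))
            for d in range(dim)
        ])
    return out


def bidirectional_fusion(clips, words):
    # Direction 1: video clips attend to the query words.
    text_aware_clips = cross_attend(clips, words)
    # Direction 2: query words attend to the video clips.
    video_aware_words = cross_attend(words, clips)
    # Final pass (hypothetical ordering): clips attend to the video-aware
    # words, giving query-aware clip representations.
    return cross_attend(text_aware_clips, video_aware_words)


# Toy features: 3 clips and 2 query words, each a 2-dimensional vector.
clips = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
words = [[1.0, 0.0], [0.0, 1.0]]
fused = bidirectional_fusion(clips, words)
```

The output keeps one vector per clip, so it could feed per-clip highlight scores and moment boundaries alike. A uni-directional variant would stop after the first `cross_attend` call, which is the limitation the abstract attributes to earlier models.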

