HyperAI

Video Narrative Grounding is a task that links visual and linguistic information by associating the nouns in a video's narrative with specific regions of the video. The input is a video, a text narrative describing it, and the positions of the nouns marked within that narrative. The output is a segmentation mask, in every frame, for the target object that each marked noun refers to. Because it locates objects precisely within videos, Video Narrative Grounding is valuable for multimodal understanding, video annotation, and content retrieval.
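
To make the input and output of the task concrete, here is a minimal Python sketch of the data shapes involved. It is an illustration only, not a reference implementation or any dataset's actual format: the names `NarrativeSample`, `empty_masks`, and `has_expected_shape` are hypothetical, noun positions are assumed to be character offsets into the narrative, and masks are assumed to be plain nested lists of booleans.

```python
from dataclasses import dataclass
from typing import List, Tuple

# A binary mask for one frame: mask[y][x] is True where the object is visible.
Mask = List[List[bool]]


@dataclass
class NarrativeSample:
    """Hypothetical container for one task input (assumed layout)."""
    num_frames: int
    height: int
    width: int
    narrative: str
    # Assumption: each noun is marked by (start, end) character offsets,
    # with the end offset exclusive.
    noun_spans: List[Tuple[int, int]]

    def nouns(self) -> List[str]:
        """Return the text of each marked noun."""
        return [self.narrative[s:e] for s, e in self.noun_spans]


def empty_masks(sample: NarrativeSample) -> List[List[Mask]]:
    """Build an all-False output: one mask per marked noun per frame."""
    return [
        [
            [[False] * sample.width for _ in range(sample.height)]
            for _ in range(sample.num_frames)
        ]
        for _ in sample.noun_spans
    ]


def has_expected_shape(sample: NarrativeSample, masks: List[List[Mask]]) -> bool:
    """Check that masks are indexed as [noun][frame][y][x] for this sample."""
    if len(masks) != len(sample.noun_spans):
        return False
    for per_noun in masks:
        if len(per_noun) != sample.num_frames:
            return False
        for frame in per_noun:
            if len(frame) != sample.height:
                return False
            if any(len(row) != sample.width for row in frame):
                return False
    return True


# Toy example: a 3-frame, 4x5 video with two marked nouns.
sample = NarrativeSample(
    num_frames=3,
    height=4,
    width=5,
    narrative="A dog chases a ball",
    noun_spans=[(2, 5), (15, 19)],
)
masks = empty_masks(sample)
```

A real model would fill these masks with its per-frame predictions for each noun; the sketch only fixes the indexing convention, so that `masks[i][t]` is the mask for the `i`-th marked noun in frame `t`.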

No Data

No benchmark data available for this task

