HyperAI

VidSTG Large-Scale Video Grounding Dataset

Date

3 years ago

Organization

Zhejiang University

Publish URL

github.com

License

Other

Categories


VidSTG is a spatio-temporal video grounding dataset built on top of VidOR. VidOR is a video relation dataset containing 7,000 training, 835 validation, and 2,165 test videos (10,000 in total). The goal of the spatio-temporal video grounding task is to localize, within an untrimmed video, the spatio-temporal region that matches a given sentence describing the target: both when the target appears (a temporal segment) and where it is in each frame of that segment.
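To make the task output concrete, here is a minimal Python sketch of what one grounding result could look like: a sentence, a temporal segment, and one bounding box per frame in that segment. The class name `GroundingResult`, its fields, the `(x1, y1, x2, y2)` box convention, and the example query are all assumptions made for illustration; this is not the VidSTG annotation format.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Assumed box convention: (x1, y1, x2, y2) in pixels, top-left and bottom-right corners.
Box = Tuple[float, float, float, float]


@dataclass
class GroundingResult:
    """Hypothetical container for one grounding result (not the dataset's file format)."""
    sentence: str            # query sentence describing the target
    start_frame: int         # first frame of the temporal segment (inclusive)
    end_frame: int           # last frame of the temporal segment (inclusive)
    boxes: Dict[int, Box]    # frame index -> box of the target in that frame

    def is_well_formed(self) -> bool:
        """Return True if the segment is non-empty and every frame in it has one valid box."""
        if self.start_frame > self.end_frame:
            return False
        expected_frames = set(range(self.start_frame, self.end_frame + 1))
        if set(self.boxes) != expected_frames:
            return False
        # Each box must have positive width and height.
        return all(x1 < x2 and y1 < y2 for x1, y1, x2, y2 in self.boxes.values())


# Made-up example: the target is visible in frames 10 through 12.
result = GroundingResult(
    sentence="a child holds a toy",
    start_frame=10,
    end_frame=12,
    boxes={10: (5, 5, 50, 80), 11: (6, 5, 51, 80), 12: (8, 6, 53, 81)},
)
print(result.is_well_formed())
```

Keying the boxes by frame index keeps the temporal and spatial parts of the answer in one structure, so a missing frame or a degenerate box is easy to detect.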