Date

4 years ago

Organization

Publish URL

vidsitu.org

Paper URL

arxiv.org

License

Other

Tags

Video Captioning

Video Understanding

VidSitu is a large-scale dataset for semantic role labeling in videos (VidSRL). It contains 29K 10-second movie clips, each annotated with verbs and semantic roles at 2-second intervals. Entities are co-referenced across the events of a clip, and events are connected by event-event relations. The clips are drawn from a large collection of movies (3K) and are selected to be complex (4.2 unique verbs per video) and diverse (200 verbs with more than 100 tokens each).
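
The annotation layout described above can be pictured with a small Python sketch. It is illustrative only, not the dataset's actual file format: the class names, field names, role labels (`Arg0`, `Arg1`), entity ids, and the relation label are all assumptions made for this example. Only the 10-second clip length and the 2-second annotation unit come from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

CLIP_SECONDS = 10  # clip length given in the description
UNIT_SECONDS = 2   # annotation unit given in the description


@dataclass
class Event:
    """One 2-second unit of a clip (field names are hypothetical)."""
    start: int                                           # seconds from clip start
    end: int
    verb: str = ""                                       # verb label for this unit
    roles: Dict[str, str] = field(default_factory=dict)  # role name -> entity id


@dataclass
class Clip:
    """A 10-second clip holding its events and event-event relations."""
    clip_id: str
    events: List[Event]
    # (source event index, target event index, relation label); labels are illustrative
    relations: List[Tuple[int, int, str]] = field(default_factory=list)


def empty_clip(clip_id: str) -> Clip:
    """Split a clip into consecutive 2-second units with no annotations yet."""
    events = [Event(s, s + UNIT_SECONDS) for s in range(0, CLIP_SECONDS, UNIT_SECONDS)]
    return Clip(clip_id, events)


def unique_verbs(clip: Clip) -> Set[str]:
    """Distinct verbs in a clip, the quantity behind the 'complex' criterion."""
    return {e.verb for e in clip.events if e.verb}


def mentions(clip: Clip, entity_id: str) -> List[Tuple[int, str]]:
    """All (event index, role) pairs where a co-referenced entity appears."""
    return [
        (i, role)
        for i, event in enumerate(clip.events)
        for role, ent in event.roles.items()
        if ent == entity_id
    ]


# Toy annotation with made-up values, purely to show the shape of the data.
demo = empty_clip("demo_clip")
demo.events[0].verb = "run"
demo.events[0].roles = {"Arg0": "person_1"}
demo.events[1].verb = "jump"
demo.events[1].roles = {"Arg0": "person_1", "Arg1": "fence_1"}
demo.relations.append((0, 1, "enables"))  # hypothetical relation label
```

Because an entity keeps a single id across events, `mentions(demo, "person_1")` finds every event in which the same person takes part. This is the practical effect of co-referencing entities within a clip.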

This dataset is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at [email protected] for prompt review and removal.