Date

4 years ago

Organization

Publish URL

github.com

Paper URL

arxiv.org

License

Other

Tags

Multimodal

Video Understanding

Visual Question Answering

Violin (VIdeO-and-Language INference) is a dataset for multimodal understanding of video and text. It contains 95,322 video-hypothesis pairs drawn from 15,887 video clips, covering more than 582 hours of video. The clips are rich in content, with varied temporal dynamics, event changes, and human interactions. They come from two sources: (i) popular TV shows and (ii) movie clips from YouTube channels.
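As a quick sanity check on the figures quoted above, the short sketch below divides the number of video-hypothesis pairs by the number of clips. The function name `pairs_per_clip` is a hypothetical helper introduced here for illustration. The result is only an average derived from the two totals; on its own it does not show that every clip carries the same number of hypotheses.

```python
# Figures quoted in the dataset description.
NUM_PAIRS = 95_322   # video-hypothesis pairs
NUM_CLIPS = 15_887   # video clips


def pairs_per_clip(num_pairs: int, num_clips: int) -> tuple[int, int]:
    """Return (whole pairs per clip, leftover pairs) using integer division.

    Hypothetical helper for illustration only; not part of any Violin tooling.
    """
    return divmod(num_pairs, num_clips)


per_clip, leftover = pairs_per_clip(NUM_PAIRS, NUM_CLIPS)
print(f"{per_clip} hypotheses per clip on average, {leftover} left over")
# prints "6 hypotheses per clip on average, 0 left over"
```

The totals divide evenly, which is consistent with a fixed number of hypotheses per clip, though the description itself does not state that.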

This dataset is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at [email protected] for prompt review and removal.

Violin Video and Language Inference Dataset

Related Datasets

RoVid-X Robot Video Generation Dataset

TransPhy3D Transparent Reflection Synthesis Video Dataset

MCD-rPPG Multi-Camera Remote Photoplethysmography Dataset