HyperAI

M³IT: Multi-Modal Multilingual Instruction Tuning Dataset

Date

2 years ago

Organization

The University of Hong Kong

Publish URL

m3-it.github.io

M³IT comprises 40 datasets, totaling 2.4 million instances and 400 manually written task instructions, all reformatted into a vision-to-text structure. It covers a range of classic vision-language tasks, including captioning, visual question answering (VQA), visual conditional generation, reasoning, and classification.