MCD Multimodal Code Generation Dataset
The Multimodal Coding Dataset (MCD) is a large-scale dataset proposed by Microsoft Research, Peking University, and Southern University of Science and Technology and released in 2025. It was introduced in the paper "VisCodex: Unified Multimodal Code Generation via Merging Vision and Coding Models".
The dataset contains approximately 598,000 high-quality samples organized in an instruction-following format. It covers multiple input modalities (text, images, code) and output modalities (code, answers, explanations), making it suitable for multimodal code understanding and generation tasks.
The data includes:
- Enhanced HTML code (HTML): about 200,000 code-screenshot pairs, focusing on visual effects and structural optimization.
- Chart: about 210,000 image-code pairs for image-to-code reproduction.
- Question and Answer (QA): about 59,000 code-question-answer pairs, with questions and answers centered on code.
- Algorithm: about 129,000 algorithm coding problems and instruction-following samples.
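As a quick sanity check on the composition listed above, the following minimal sketch tallies the four subsets and computes each one's share of the total. The sizes are the rounded figures from this page, not exact counts, and the names `MCD_SUBSETS` and `subset_shares` are hypothetical helpers introduced here for illustration; they are not part of any official MCD tooling.

```python
# Approximate subset sizes as stated in the list above (rounded, not exact).
MCD_SUBSETS = {
    "HTML": 200_000,
    "Chart": 210_000,
    "QA": 59_000,
    "Algorithm": 129_000,
}


def subset_shares(sizes):
    """Return each subset's fraction of the total sample count."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


total = sum(MCD_SUBSETS.values())
shares = subset_shares(MCD_SUBSETS)

# Print the overall total, then one line per subset with its percentage share.
print(f"total: {total:,}")
for name, share in shares.items():
    print(f"{name:<10} {share:.1%}")
```

The rounded subset sizes add up to the stated total of roughly 598,000 samples, with the Chart and HTML subsets together accounting for the majority of the data.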