KodCode-V1 Synthetic Coding Dataset
Date
2 months ago
Size
1.99 GB
Publish URL
License
CC BY 4.0
Categories
KodCode was released in 2025 by researchers from Microsoft GenAI, the University of Washington, and the University of Texas at Austin. It is described in the paper "KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding".
The dataset is described by its authors as the largest fully synthetic open-source dataset that provides verifiable solutions and tests for coding tasks. It contains 12 subsets spanning a range of domains (from algorithms to package-specific knowledge) and difficulty levels (from basic coding exercises to interview and competitive programming problems), and it is designed for supervised fine-tuning (SFT) and reinforcement learning (RL) tuning.

KodCode-V1.torrent
Seeding: 1, Downloading: 2, Completed: 24, Total Downloads: 26