T-Wix Russian SFT Dataset
T-Wix is a Russian supervised fine-tuning (SFT) dataset. The related paper is "From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning". The dataset aims to strengthen a model's capabilities across a range of tasks, from solving algorithmic and mathematical problems to dialogue, logical thinking, and reasoning patterns.
The dataset contains 499,598 Russian-language samples in two subsets. The general subset has 468,614 samples covering areas such as mathematics, science, programming, general knowledge, instruction following, and role-playing. The reasoning subset has 30,984 samples that focus on advanced mathematics and science problems and include detailed reasoning traces.
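As a quick sanity check on the figures above, the following minimal sketch adds the two subset sizes and computes each subset's share of the whole. The constant and function names are hypothetical, chosen only for illustration; the counts are taken from the description and no dataset files are read.

```python
# Sample counts as stated in the dataset description (not read from the data).
GENERAL_SAMPLES = 468_614
REASONING_SAMPLES = 30_984
STATED_TOTAL = 499_598


def split_shares(general: int, reasoning: int) -> dict[str, float]:
    """Return each subset's fraction of the combined sample count."""
    total = general + reasoning
    return {"general": general / total, "reasoning": reasoning / total}


# Check that the two subsets add up to the stated total.
total = GENERAL_SAMPLES + REASONING_SAMPLES
shares = split_shares(GENERAL_SAMPLES, REASONING_SAMPLES)

print(f"total samples: {total:,} (matches stated total: {total == STATED_TOTAL})")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The two subsets sum exactly to the stated total, and the reasoning subset makes up roughly 6.2% of the dataset.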