AM-DeepSeek-R1-Distilled-1.4M Large-Scale General Reasoning Task Dataset
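AM-DeepSeek-R1-Distilled-1.4M is a large-scale general reasoning dataset released by a-m-team in March 2025. The accompanying paper is "1.4 Million Open-Source Distilled Reasoning Dataset to Empower Large Language Model Training".

The dataset contains about 1.4 million entries covering math, code, science Q&A, and general chat problems. The data were carefully selected, semantically deduplicated, and rigorously cleaned so that the problems remain high in quality and challenging. Each entry includes a detailed thinking trace, which gives a model worked examples of the reasoning process and helps it learn to understand and generate solutions to complex reasoning tasks.

The dataset is intended as a resource for natural language processing and reasoning research, in particular for training and improving the reasoning ability of large language models. It can help models perform better in key areas such as math, code, and science Q&A, and so handle a wider range of complex reasoning tasks.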
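As a rough illustration of what working with an entry's thinking trace might look like, the sketch below separates a reasoning trace from the final answer in one response string. The `<think>`/`<answer>` tag layout, the `question` and `response` field names, and the `split_reasoning` helper are all assumptions made for this example; check the dataset card for the actual schema before relying on them.

```python
import re

# Assumed layout (not confirmed by this page): the reasoning trace sits in
# <think>...</think> and the final answer in <answer>...</answer>.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def split_reasoning(response: str) -> tuple[str, str]:
    """Return (thinking_trace, final_answer) extracted from one response."""
    think = THINK_RE.search(response)
    answer = ANSWER_RE.search(response)
    trace = think.group(1).strip() if think else ""
    if answer:
        final = answer.group(1).strip()
    else:
        # No <answer> tag: treat everything outside the trace as the answer.
        final = THINK_RE.sub("", response).strip()
    return trace, final


# Hypothetical entry, written by hand for illustration only.
sample = {
    "question": "What is 12 * 7?",
    "response": "<think>12 * 7 = 84.</think><answer>84</answer>",
}

trace, final = split_reasoning(sample["response"])
print("trace:", trace)
print("answer:", final)
```

Keeping the trace and the answer separate makes it straightforward to train on both, or to filter entries by trace length, without re-parsing the raw text each time.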
Citation
If you find our work helpful for your research, please give us a star and cite our work:
@misc{tian2025correctanswersequaldistillation,
  title={Not All Correct Answers Are Equal: Why Your Distillation Source Matters},
  author={Xiaoyu Tian and Yunjie Ji and Haotian Wang and Shuaiting Chen and Sitong Zhao and Yiping Peng and Han Zhao and Xiangang Li},
  year={2025},
  eprint={2505.14464},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.14464},
}

@misc{ji2025amthinkingv1advancingfrontierreasoning,
  title={AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale},
  author={Yunjie Ji and Xiaoyu Tian and Sitong Zhao and Haotian Wang and Shuaiting Chen and Yiping Peng and Han Zhao and Xiangang Li},
  year={2025},
  eprint={2505.08311},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.08311},
}

@misc{tian2025exploringpotentialofflinerl,
  title={Exploring the Potential of Offline RL for Reasoning in LLMs: A Preliminary Study},
  author={Xiaoyu Tian and Sitong Zhao and Haotian Wang and Shuaiting Chen and Yiping Peng and Yunjie Ji and Han Zhao and Xiangang Li},
  year={2025},
  eprint={2505.02142},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.02142},
}

@misc{tian2025deepdistillenhancingllmreasoning,
  title={DeepDistill: Enhancing LLM Reasoning Capabilities via Large-Scale Difficulty-Graded Data Training},
  author={Xiaoyu Tian and Sitong Zhao and Haotian Wang and Shuaiting Chen and Yiping Peng and Yunjie Ji and Han Zhao and Xiangang Li},
  year={2025},
  eprint={2504.17565},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2504.17565},
}

@misc{wang2025leveragingreasoningmodelanswers,
  title={Leveraging Reasoning Model Answers to Enhance Non-Reasoning Model Capability},
  author={Haotian Wang and Han Zhao and Shuaiting Chen and Xiaoyu Tian and Sitong Zhao and Yunjie Ji and Yiping Peng and Xiangang Li},
  year={2025},
  eprint={2504.09639},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2504.09639},
}

@misc{ji2025difficultyawarestagedreinforcementlearning,
  title={How Difficulty-Aware Staged Reinforcement Learning Enhances LLMs' Reasoning Capabilities: A Preliminary Experimental Study},
  author={Yunjie Ji and Sitong Zhao and Xiaoyu Tian and Haotian Wang and Shuaiting Chen and Yiping Peng and Han Zhao and Xiangang Li},
  year={2025},
  eprint={2504.00829},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2504.00829},
}

@misc{tian2025thinktwiceenhancingllm,
  title={Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking},
  author={Xiaoyu Tian and Sitong Zhao and Haotian Wang and Shuaiting Chen and Yunjie Ji and Yiping Peng and Han Zhao and Xiangang Li},
  year={2025},
  eprint={2503.19855},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2503.19855},
}

@misc{zhao202514millionopensourcedistilled,
  title={1.4 Million Open-Source Distilled Reasoning Dataset to Empower Large Language Model Training},
  author={Han Zhao and Haotian Wang and Yiping Peng and Sitong Zhao and Xiaoyu Tian and Shuaiting Chen and Yunjie Ji and Xiangang Li},
  year={2025},
  eprint={2503.19633},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2503.19633},
}