Date: 8 months ago

Size: 21.79 GB

Organization:

Paper URL: https://arxiv.org/abs/2509.19894

License: MIT

Data composition:

In the supervised fine-tuning (SFT) scenario, a total of 4,766,890 prompts were synthesized, including:

1,188,505 programming task prompts
3,578,385 math task prompts
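As a quick sanity check on the figures above, the following minimal sketch confirms that the two category counts add up to the stated total and reports each category's share. The variable and function names are illustrative only and are not part of the dataset or its tooling.

```python
# Prompt counts as stated in the data composition section.
programming_prompts = 1_188_505
math_prompts = 3_578_385
stated_total = 4_766_890


def composition_shares(counts):
    """Return each category's fraction of the summed total."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


counts = {"programming": programming_prompts, "math": math_prompts}
total = sum(counts.values())
shares = composition_shares(counts)

# The two categories should account for every synthesized prompt.
assert total == stated_total

for name, share in shares.items():
    print(f"{name}: {counts[name]:,} prompts ({share:.1%})")
```

By these counts, math prompts make up roughly three quarters of the SFT set and programming prompts roughly one quarter.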

Citation

@article{zhao2025promptcot2,
  title   = {PromptCoT 2.0: Scaling Prompt Synthesis for Large Language Model Reasoning},
  author  = {Zhao, Xueliang and Wu, Wei and Guan, Jian and Gong, Zhuocheng and Kong, Lingpeng},
  journal = {arXiv preprint arXiv:2509.19894},
  year    = {2025},
  url     = {https://arxiv.org/abs/2509.19894}
}

PromptCoT-2.0-SFT-4.8M.torrent

Seeding: 1 | Downloading: 0 | Completed: 69 | Total Downloads: 148

PromptCoT-2.0-SFT-4.8M/
- README.md
  1.53 KB
- README.txt
  3.06 KB

This dataset is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at [email protected] for prompt review and removal.


Related Datasets

MAKIEVAL Multilingual Cultural Knowledge Assessment Dataset

2 hours ago

SAM 3D Artist Objects 3D Object Reconstruction Dataset

in 2 hours

Nemotron-SFT-Math-v4 Mathematical Inference SFT Dataset

2 hours ago

ChartNet Chart Understanding Multimodal Dataset

18 days ago

World Air Pollution and AQI Dataset

18 days ago

SMOL Multilingual Translation Parallel Dataset

19 days ago

chi-bench Medical Intelligent Agent Benchmark Evaluation Dataset

6 days ago

VisCoR-55K Visual Inference Dataset

a month ago

LongBlocks Long Context Multilingual Question Answering Dataset

a month ago

MathNet Multimodal Mathematical Benchmark Inference Dataset

a month ago

Claw-Eval Real-World Benchmark Dataset

a month ago

QCalEval Quantum Calibration Graph Understanding Dataset

2 months ago

RSRCC Remote Sensing Area Change Understanding Benchmark Dataset

a day ago

OpenMementos Context Memory Compressed Dataset

2 months ago

OmniParsingBench Multimodal Parsing Capability Evaluation Dataset

a day ago

MDPBench Multilingual Document Parsing Benchmark Dataset

a day ago

GPT-5.4-step-by-step-reasoning Dataset

2 months ago

PromptCoT-2.0-SFT-4.8M Supervised fine-tuning Prompt SFT Dataset
