HyperAIHyperAI

Command Palette

Search for a command to run...

Nemotron-Math-v2 Mathematical Inference Dataset

Date

2 days ago

Organization

NVIDIA

License

CC BY-SA 4.0

Nemotron-Math-v2 is a mathematical inference dataset released by NVIDIA Corporation in 2025. Related research papers include... Nemotron-Math: Efficient Long-Context Distillation of Mathematical Reasoning from Multi-Mode Supervision It is primarily used to train LLMs to perform structured mathematical reasoning, to study the differences between tool-enhanced reasoning and pure language reasoning, and to build long-context or multi-track reasoning systems.

This dataset contains approximately 347,000 high-quality mathematical problems and 7 million model-generated inference trajectories. Each problem is solved in six configurations: high/medium/low inference depth and with or without Python TIR, and the answers are validated via a pipeline using an LLM as the arbiter.

Data Fields:

  • Problem: Problem statements extracted from sources such as OpenMathReasoning and MathStackExchange.
  • Messages: The user's and assistant's conversation log, used for LLM training.
  • expected_answer: The extracted answer or the majority vote answer generated by the model.
  • metadata: Pass rate under different reasoning and tool usage scenarios
  • data_source: Data source is AoPS or StackExchange-Math
  • tool: The tool definition used, or empty.

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding
Ready-to-use GPUs
Best Pricing

HyperAI Newsletters

Subscribe to our latest updates
We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning
Powered by MailChimp