HyperAI

MetaMathQA Mathematical Reasoning Dataset

Date

2 years ago

Size

84.34 MB

Organization

Hong Kong University of Science and Technology
Southern University of Science and Technology
University of Cambridge

Publish URL

huggingface.co

License

CC BY-SA 4.0

Most existing open-source LLMs (such as LLaMA-2) still perform unsatisfactorily on mathematical problems, because solving them requires complex, multi-step reasoning. To bridge this gap, researchers from Cambridge, Hong Kong University of Science and Technology, and Huawei proposed MetaMath, a fine-tuned language model specialized in mathematical reasoning. To improve the model's forward and reverse reasoning capabilities, they built the MetaMathQA dataset from two commonly used mathematical datasets, GSM8K and MATH. MetaMathQA is a high-quality mathematical reasoning dataset with wide coverage, consisting of 395K forward and reverse math question-answer pairs generated by a large language model. MetaMath is obtained by fine-tuning LLaMA-2 on MetaMathQA, and it achieved state-of-the-art (SOTA) results on mathematical reasoning benchmarks. The MetaMathQA dataset and MetaMath models of different sizes have been open-sourced for researchers to use.

MetaMathQA includes four data augmentation methods:

  1. Answer Augmentation: Given a question, a large language model generates a chain of thought that leads to the correct answer, and this chain is used as data augmentation.
  2. Rephrasing Question: Given a meta-question, a large language model rewrites the question and generates a chain of thought that leads to the correct answer as data augmentation.
  3. FOBAR Question (FOBAR reverse question augmentation): Given a meta-question, a reverse question is generated by masking a number in the conditions as x, stating the original answer, and asking for the value of x. A correct chain of thought for the reverse question is then generated as data augmentation.
  4. Self-Verification Question: Building on FOBAR, a large language model rewrites the question part of the reverse question as a declarative statement, and the result is used as data augmentation.
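
The FOBAR step in item 3 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function name `make_fobar_question`, the regular expression used to find numbers, and the exact wording of the appended suffix are all assumptions made for this example. The chain of thought itself would still have to be generated by a large language model.

```python
import re
from typing import Tuple

# Assumed suffix wording (illustrative only): it states the original answer
# and asks for the masked value x.
FOBAR_SUFFIX = (
    " If we know the answer to the above question is {answer}, "
    "what is the value of unknown variable x?"
)


def make_fobar_question(question: str, answer: str, mask_index: int = 0) -> Tuple[str, str]:
    """Mask one number in `question` as x and append the known answer.

    Returns (reverse_question, masked_value), where masked_value is the
    number that x replaced, i.e. the target of the reverse question.
    """
    # Find every integer or decimal number in the question text.
    numbers = list(re.finditer(r"\d+(?:\.\d+)?", question))
    if not numbers:
        raise ValueError("question contains no number to mask")
    target = numbers[mask_index]
    # Replace only the selected occurrence with the symbol x.
    masked = question[:target.start()] + "x" + question[target.end():]
    return masked + FOBAR_SUFFIX.format(answer=answer), target.group()


if __name__ == "__main__":
    q = ("James buys 5 packs of beef that are 4 pounds each. "
         "The price of beef is $5.50 per pound. How much did he pay?")
    reverse_q, hidden = make_fobar_question(q, "110")
    print(reverse_q)
    print("masked value:", hidden)
```

A Self-Verification question (item 4) differs only in how the known answer is presented: the question and its answer are first rewritten into a declarative statement, which requires a language model and is therefore not shown here.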
MetaMathQA.torrent
  • MetaMathQA/
    • README.md (2.44 KB)
    • README.txt (4.88 KB)
    • data/
      • MetaMathQA.zip (84.34 MB)
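
Once the archive is downloaded, a quick sanity check is to count how many records belong to each augmentation method. The sketch below rests on assumptions that this page does not confirm: that MetaMathQA.zip contains one or more `.json` files, that each file holds a list of records, and that each record labels its augmentation method under a `type` key. The function name `count_by_type` is hypothetical. Adjust the names to whatever the README in the archive documents.

```python
import json
import zipfile
from collections import Counter


def count_by_type(zip_source, type_key="type"):
    """Count records per augmentation label across all .json files in a zip.

    `zip_source` may be a file path or a binary file-like object.
    `type_key` is an assumed field name, not confirmed by the dataset page.
    """
    counts = Counter()
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.endswith(".json"):
                continue  # skip READMEs and directory entries
            with archive.open(name) as handle:
                records = json.load(handle)  # assumed: a JSON list of dicts
            counts.update(rec.get(type_key, "unknown") for rec in records)
    return counts


if __name__ == "__main__":
    # Path taken from the file listing above; the archive layout is assumed.
    for label, n in count_by_type("MetaMathQA/data/MetaMathQA.zip").most_common():
        print(f"{label}: {n}")
```

If the assumptions hold, the per-label counts should sum to roughly the 395K pairs mentioned in the description.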