HyperAI

GOAT Arithmetic Task Fine-tuning Dataset

Date: 6 months ago

Size: 89.46 MB

Organization: National University of Singapore

Publish URL: github.com

This dataset was released in 2023 by researchers from the National University of Singapore, alongside the paper "Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks".

The dataset contains two files: dataset.json and dataset.ipynb. The dataset.json file holds about 1.7 million synthetic examples for arithmetic tasks, generated by dataset.ipynb.

Each instance in the dataset contains the following:

  • instruction: A human-created instruction, formed by inserting an arithmetic expression into a randomly selected template and adding some natural-language noise. It serves as the prompt for instruction fine-tuning of the model.
  • input: A randomly generated arithmetic expression. It can replace "instruction" during training when the goal is to focus on arithmetic and avoid the influence of natural language.
  • output: The target output for the model to learn. It contains chain-of-thought (CoT) steps for multi-digit multiplication and division.
  • answer: The direct numerical answer to the arithmetic task. It can be used to test the learnability of various subtasks.
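
The four fields above can be sketched in use as follows. This is a minimal illustration only. It assumes dataset.json is a JSON array of objects keyed by the field names listed above, which this page does not confirm. The sample record is invented for the example and is not copied from the dataset, and the helper names build_pair and is_correct are hypothetical.

```python
import json

# Hypothetical record shaped like the four fields described above.
# The values are made up for illustration; they are not taken from dataset.json.
SAMPLE_JSON = """
[
  {
    "instruction": "What is 12 + 30?",
    "input": "12 + 30",
    "output": "12 + 30 = 42",
    "answer": "42"
  }
]
"""


def build_pair(record, use_instruction=True):
    """Return a (prompt, target) pair for one record.

    With use_instruction=True the natural-language "instruction" is the prompt;
    otherwise the bare arithmetic expression in "input" is used instead.
    The target is always the "output" field.
    """
    prompt = record["instruction"] if use_instruction else record["input"]
    return prompt, record["output"]


def is_correct(prediction, record):
    """Exact-match check of a predicted final answer against "answer"."""
    return prediction.strip() == record["answer"].strip()


# Assumption: the real file could be read the same way with json.load(open(...)).
records = json.loads(SAMPLE_JSON)

noisy_prompt, target = build_pair(records[0])
bare_prompt, _ = build_pair(records[0], use_instruction=False)

print("instruction prompt:", noisy_prompt)
print("input prompt:", bare_prompt)
print("target:", target)
print("exact match:", is_correct(" 42 ", records[0]))
```

Switching use_instruction to False corresponds to training on the arithmetic expression alone, while is_correct uses the "answer" field to score a prediction without parsing any intermediate steps in "output".
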
goat.torrent
  • goat/
    • README.md (1.68 KB)
    • README.txt (3.35 KB)
    • data/
      • goat.zip (89.46 MB)