Alpaca-Cleaned Instruction fine-tuning Dataset

Date: a year ago

Size: 13.98 MB

The Alpaca-Cleaned dataset is a cleaned version of the original Alpaca dataset released by Stanford University in 2023. The original Alpaca dataset contains 52,000 instructions and demonstrations generated with OpenAI's text-davinci-003 model. This data can be used for instruction tuning, which trains a language model to follow instructions more reliably.
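As a rough illustration of how such instruction data is typically turned into training text, the sketch below formats one record into a prompt. It assumes each record is a JSON object with `instruction`, `input`, and `output` keys, as in the usual Alpaca layout; the template wording and the `build_prompt` helper are illustrative assumptions, not part of this dataset's files.

```python
# Minimal sketch: format an Alpaca-style record into a prompt string.
# Assumption: records carry "instruction", "input", and "output" keys.
# The template text below is illustrative, not taken from this dataset.

PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_prompt(record: dict) -> str:
    """Return the prompt for one record; the output field is the target."""
    instruction = record["instruction"].strip()
    context = (record.get("input") or "").strip()
    if context:
        return PROMPT_WITH_INPUT.format(instruction=instruction, input=context)
    return PROMPT_NO_INPUT.format(instruction=instruction)


if __name__ == "__main__":
    example = {
        "instruction": "Translate the sentence to French.",
        "input": "Good morning.",
        "output": "Bonjour.",
    }
    print(build_prompt(example) + example["output"])
```

During fine-tuning, the prompt would usually be concatenated with the record's `output` field, which serves as the demonstration the model learns to reproduce.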

This dataset addresses several problems in the original Alpaca data, including hallucinated answers, merged instructions, empty outputs, and inconsistent input fields, which improves the quality and consistency of the data. Alpaca-Cleaned can be applied to text generation, question-answering systems, natural language understanding, and code understanding and generation. The dataset is open source and community supported: it encourages community participation and is continuously updated and improved, with the aim of giving fine-tuned models better performance and advancing the NLP field.
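Two of the issues listed above, empty outputs and inconsistent input fields, lend themselves to simple mechanical checks. The following is a minimal sketch of such a pass, not the procedure actually used to build Alpaca-Cleaned. The placeholder strings in `EMPTY_INPUT_PLACEHOLDERS` and the helper names are assumptions chosen for illustration.

```python
# Minimal sketch of two cleaning steps: drop records with empty outputs and
# normalize "no input" placeholders to an empty string.
# Assumption: records are dicts with "instruction", "input", "output" keys.
# Assumption: the placeholder set below is illustrative, not exhaustive.

EMPTY_INPUT_PLACEHOLDERS = {"", "<no input>", "no input", "noinput", "n/a", "none"}


def normalize_input(value) -> str:
    """Map missing or placeholder input values to an empty string."""
    text = (value or "").strip()
    return "" if text.lower() in EMPTY_INPUT_PLACEHOLDERS else text


def clean_records(records: list) -> list:
    """Return cleaned copies of the records, skipping those without output."""
    cleaned = []
    for record in records:
        output = (record.get("output") or "").strip()
        if not output:
            continue  # empty outputs carry no demonstration to learn from
        cleaned.append(
            {
                "instruction": (record.get("instruction") or "").strip(),
                "input": normalize_input(record.get("input")),
                "output": output,
            }
        )
    return cleaned


if __name__ == "__main__":
    sample = [
        {"instruction": "Name a primary color.", "input": "<no input>", "output": "Red."},
        {"instruction": "Summarize the text.", "input": "Some text.", "output": ""},
    ]
    print(clean_records(sample))
```

Hallucinated answers and merged instructions generally require manual or model-assisted review, so they are not covered by this sketch.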

Alpaca-Cleaned.torrent
Seeding: 2 | Downloading: 0 | Completed: 240 | Total Downloads: 260
  • Alpaca-Cleaned/
    • README.md
      1.57 KB
    • README.txt
      3.15 KB
    • data/
      • Alpaca-Cleaned.zip
        13.98 MB
