HyperAI

Educhat-sft-002-data-osm Educational Dialogue Dataset

Date: 4 months ago
Size: 1.39 GB
Organization:
Publish URL: github.com

The educhat-sft-002-data-osm dataset is a dialogue dataset for the education domain. It was developed in 2023 by the EduNLP team at the School of Computer Science and Technology, East China Normal University. The associated paper is "EduChat: A Large-Scale Language Model-based Chatbot System for Intelligent Education".

The dataset combines multiple open-source Chinese and English instruction and dialogue datasets. After deduplication, it contains about 4 million samples, including data from a variety of education-specific areas such as open question answering, essay correction, heuristic teaching, emotional support, and course tutoring. Each sample consists of a list that stores the dialogue and a system_prompt for that sample. The list stores the dialogue as alternating question (Q) and answer (A) entries.
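To make the record layout described above concrete, here is a minimal Python sketch of how one sample might be split into its system prompt and question/answer pairs. Only the name system_prompt comes from the description; the key name `dialogue`, the assumption that each list entry is a plain string, and the sample content are all hypothetical, so check the README files in the archive for the actual schema before relying on this.

```python
from typing import Dict, List, Tuple


def to_turns(record: Dict) -> Tuple[str, List[Tuple[str, str]]]:
    """Split one record into its system prompt and (question, answer) pairs."""
    # "dialogue" is a hypothetical key name; the real key may differ.
    dialogue = record["dialogue"]
    if len(dialogue) % 2 != 0:
        raise ValueError("expected alternating Q, A entries (even length)")
    # Even positions hold questions, odd positions hold answers.
    pairs = list(zip(dialogue[0::2], dialogue[1::2]))
    return record["system_prompt"], pairs


# Made-up sample for illustration only; it is not taken from the dataset.
sample = {
    "system_prompt": "You are a patient tutor.",
    "dialogue": [
        "What is 2 + 3?",
        "2 + 3 equals 5.",
        "And 5 times 2?",
        "5 times 2 equals 10.",
    ],
}

system_prompt, turns = to_turns(sample)
for question, answer in turns:
    print(f"Q: {question}\nA: {answer}")
```

Pairing entries by position keeps the sketch independent of any role labels; if the real records carry explicit speaker fields, reading those would be more robust than relying on order.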

This dataset is part of the EduChat project, which aims to provide an open-source large language model for intelligent question answering in education. With this dataset, EduChat can support functions such as automatic question generation, homework grading, emotional support, course tutoring, and college entrance examination consultation in educational scenarios. It is intended to serve teachers, students, and parents, and to help achieve intelligent education that is tailored to students’ aptitude, fair, and warm.

educhat-sft-002-data-osm.torrent
  • educhat-sft-002-data-osm/
    • README.md (1.68 KB)
    • README.txt (3.35 KB)
    • data/
      • educhat-sft.zip (1.39 GB)