DPO-zh-en-emoji Emoji Question Answering Dataset
Dataset Introduction
DPO-zh-en-emoji is a dataset released by shareAI in 2024 for fine-tuning large language models, where "DPO" stands for Direct Preference Optimization. It contains a large number of question-answer pairs. Each question has two answers, one in Chinese and one in English, and the answers are written in a playful, humorous style that includes emojis. The research team selected questions from Zhihu, logical reasoning problems, and Ruozhiba (a Baidu Tieba forum of deliberately absurd questions, whose name is sometimes rendered literally as "idiots") to serve as queries, and used the llama3 70b instruct model to sample one Chinese answer and one English answer for each query. This design is intended to activate the language style preferences of multilingual chat models and to improve both the quality of generated content and its alignment with human preferences.
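To make the pairing concrete, here is a minimal sketch of how one bilingual record might be turned into a DPO-style preference triple. The field names (`prompt`, `answer_zh`, `answer_en`), the sample text, and the helper `to_preference_pair` are all assumptions made for illustration and are not taken from the dataset itself. Which language counts as "chosen" depends on the style you want the model to prefer.

```python
from typing import Dict

# Hypothetical record layout and made-up sample text;
# the real dataset's field names and contents may differ.
sample = {
    "prompt": "Why is the sky blue?",
    "answer_zh": "因为瑞利散射呀 😄",
    "answer_en": "Because of Rayleigh scattering 😄",
}


def to_preference_pair(record: Dict[str, str], preferred: str = "zh") -> Dict[str, str]:
    """Build a (prompt, chosen, rejected) triple from one bilingual record.

    `preferred` selects which language's answer is treated as the chosen
    response; the other language's answer becomes the rejected one.
    """
    if preferred not in ("zh", "en"):
        raise ValueError("preferred must be 'zh' or 'en'")
    other = "en" if preferred == "zh" else "zh"
    return {
        "prompt": record["prompt"],
        "chosen": record[f"answer_{preferred}"],
        "rejected": record[f"answer_{other}"],
    }


pair = to_preference_pair(sample, preferred="zh")
print(pair["chosen"])
```

Passing `preferred="en"` instead would produce a pair that favors the English answer for the same query.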