DS-1000 Code Generation Benchmark Dataset

DS-1000 is a code generation benchmark dataset jointly released in 2022 by the University of Hong Kong, Peking University, and other institutions. It focuses on code generation tasks in data science. It was introduced in the paper "DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation".

The dataset contains 1,000 real data science problems collected from StackOverflow, covering seven widely used Python data science libraries, including NumPy, Pandas, and TensorFlow. The problems reflect the diversity and practicality of real-world use cases, and solutions are checked for reliability and correctness by multi-criteria automatic evaluation. To keep models from answering correctly by simply memorizing training data, the original problems were modified through surface perturbations, semantic perturbations, and difficult rewrites, so that a model must actually understand a problem in order to solve it.
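The idea of judging a solution against more than one criterion can be sketched in a few lines of Python. This is a minimal illustration and not the benchmark's actual harness: it assumes the two criteria are (a) passing executable test code and (b) a surface-form constraint, here "no explicit loops". The function names, the toy problem, and the loop rule are all hypothetical.

```python
import ast


def passes_tests(candidate_code: str, test_code: str) -> bool:
    """Criterion 1: run the candidate, then the tests, in one shared namespace."""
    namespace: dict = {}
    try:
        exec(candidate_code, namespace)  # untrusted code belongs in a sandbox
        exec(test_code, namespace)       # a failing assertion raises here
    except Exception:
        return False
    return True


def uses_no_loops(candidate_code: str) -> bool:
    """Criterion 2 (hypothetical surface-form rule): no for/while statements."""
    tree = ast.parse(candidate_code)
    return not any(isinstance(node, (ast.For, ast.While)) for node in ast.walk(tree))


def evaluate(candidate_code: str, test_code: str) -> bool:
    """A solution is accepted only if it satisfies both criteria."""
    return passes_tests(candidate_code, test_code) and uses_no_loops(candidate_code)


# Toy problem (not taken from DS-1000): sum a list of numbers.
test_code = "assert total([1, 2, 3]) == 6"
vectorized = "def total(xs):\n    return sum(xs)"
looped = "def total(xs):\n    acc = 0\n    for x in xs:\n        acc += x\n    return acc"
wrong = "def total(xs):\n    return 0"

print(evaluate(vectorized, test_code))  # True: correct and loop-free
print(evaluate(looped, test_code))      # False: correct, but breaks the surface rule
print(evaluate(wrong, test_code))       # False: fails the test
```

Combining the checks means a solution that returns the right value in a disallowed way is still rejected, which test execution alone would not catch.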

The dataset has a clear structure. The problems for each library are presented in two prompt formats, Completion and Insertion, and each problem contains meta information, input data, reference code, and test code. This design makes every problem self-contained and verifiable. DS-1000 applies to a wide range of scenarios, including automatic code completion, education and learning, and performance evaluation.
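One way to picture a single problem record is the sketch below. It is an assumption-laden illustration, not the dataset's real schema: the field names, the `[insert]` marker, and the helper functions are hypothetical, and the prompt rendering assumes that Completion supplies only the code before the gap while Insertion supplies the code on both sides of it.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    # Hypothetical field names mirroring the four parts of a problem.
    metadata: dict        # meta information about the problem
    setup_code: str       # code that builds the input data
    reference_code: str   # a known-good solution
    test_code: str        # assertions that check the result


def is_verifiable(problem: Problem) -> bool:
    """Return True if the reference solution passes the problem's own tests."""
    namespace: dict = {}
    try:
        exec(problem.setup_code, namespace)
        exec(problem.reference_code, namespace)
        exec(problem.test_code, namespace)
    except Exception:
        return False
    return True


def render_prompt(prefix: str, suffix: str, prompt_format: str) -> str:
    """Render a prompt in one of the two formats (the layout is an assumption)."""
    if prompt_format == "Completion":
        return prefix  # the model continues from the end of the prefix
    if prompt_format == "Insertion":
        return f"{prefix}\n[insert]\n{suffix}"  # the model fills the marked gap
    raise ValueError(f"unknown prompt format: {prompt_format}")


# Toy record in plain Python (not taken from DS-1000).
problem = Problem(
    metadata={"source": "toy example"},
    setup_code="values = [3, 1, 2]",
    reference_code="result = sorted(values)",
    test_code="assert result == [1, 2, 3]",
)

print(is_verifiable(problem))  # True
print(render_prompt(problem.setup_code, "print(result)", "Insertion"))
```

Because the record carries its own input data, reference code, and test code, it can be checked in isolation: running the reference solution against the tests confirms that the problem is internally consistent before any model is evaluated on it.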