pyMethods2Test Programming Language Processing Dataset

Date: 9 months ago

Size: 3.74 GB

Publish URL: zenodo.org

Paper URL: arxiv.org

The pyMethods2Test dataset was created by researchers at the University of Nebraska–Lincoln in 2025. It contains a large number of open source unit test methods and their corresponding focal method mappings, and it addresses the lack of large test datasets for Python. It is described in the paper "pyMethods2Test: A Dataset of Python Tests Mapped to Focal Methods" and is intended as training data for large language models (LLMs), so that they can learn to generate effective unit tests for Python code.

The dataset was built by mining 88,846 Python projects on GitHub that use the Pytest and unittest frameworks, yielding 22,662,037 test methods and 2,198,378 focal method mappings.

The dataset covers more than 22 million test methods, about 2.2 million of which are mapped to focal methods. Each mapping comes with detailed context information, such as the test file path, focal file path, class name, method name, and line number. The data is stored in JSON format for easy processing, and a script for generating focal method context is also provided.
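As a rough illustration of how one such JSON mapping might be consumed, the sketch below parses a single record and prints a one-line summary. The key names (`test_file_path`, `focal_file_path`, `test_class`, `test_method`, `focal_method`, `focal_line`) and the sample values are hypothetical, chosen only to mirror the kinds of fields listed above; the dataset's README defines the actual schema.

```python
import json

# Hypothetical record: these key names and values are illustrative
# assumptions, not the dataset's documented schema.
SAMPLE_RECORD = """
{
  "test_file_path": "tests/test_calc.py",
  "focal_file_path": "src/calc.py",
  "test_class": "TestCalc",
  "test_method": "test_add",
  "focal_method": "add",
  "focal_line": 12
}
"""


def summarize_mapping(record: dict) -> str:
    """Return a one-line 'test -> focal method' description of a mapping."""
    test = (
        f'{record["test_file_path"]}::'
        f'{record["test_class"]}.{record["test_method"]}'
    )
    focal = (
        f'{record["focal_file_path"]}:{record["focal_line"]} '
        f'{record["focal_method"]}'
    )
    return f"{test} -> {focal}"


record = json.loads(SAMPLE_RECORD)
print(summarize_mapping(record))
```

With the real data, the same function would only need its key names adjusted to match the published schema.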

The data is stored in two ZIP files. If you only want to use the pre-mined focal data, unzip the focal-data.zip file (about 2 GB after decompression). The larger raw-data.zip file (about 42 GB after decompression) contains the raw data used to generate the focal data, such as classes and methods extracted from the repositories.
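A minimal sketch of unpacking the smaller archive and walking its JSON files, using only the Python standard library. It assumes that focal-data.zip sits in the current directory and that the extracted tree contains `.json` files; the internal directory layout and the helper names `extract_archive` and `iter_json_files` are assumptions made for illustration.

```python
import json
import zipfile
from pathlib import Path
from typing import Iterator, Tuple


def extract_archive(archive: Path, dest: Path) -> Path:
    """Extract a ZIP archive into dest (created if missing) and return dest."""
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    return dest


def iter_json_files(root: Path) -> Iterator[Tuple[Path, object]]:
    """Yield (path, parsed content) for every .json file under root."""
    for path in sorted(root.rglob("*.json")):
        with path.open(encoding="utf-8") as fh:
            yield path, json.load(fh)


if __name__ == "__main__":
    archive = Path("focal-data.zip")  # assumed local path to the download
    if archive.exists():
        out = extract_archive(archive, Path("focal-data"))
        # Peek at the first file only; the full set is about 2 GB on disk.
        for path, payload in iter_json_files(out):
            print(path, type(payload).__name__)
            break
```

Iterating lazily keeps memory use bounded, which matters more for raw-data.zip given its roughly 42 GB decompressed size.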

pyMethods2Test.torrent
Seeding: 1 | Downloading: 0 | Completed: 82 | Total Downloads: 165
  • pyMethods2Test/
    • README.md
      2.14 KB
    • README.txt
      4.29 KB
    • data/
      • pyMethods2Test.zip
        3.74 GB