HyperAI

MMMLU: Multilingual Massive Multitask Language Understanding Dataset


The Multilingual Massive Multitask Language Understanding (MMMLU) dataset is an open-source dataset released by OpenAI in 2024 to evaluate and improve the performance of artificial intelligence models across different linguistic, cognitive, and cultural contexts. MMMLU is built on the Massive Multitask Language Understanding (MMLU) benchmark, which measures the general knowledge acquired by AI models. MMLU covers 57 subject areas, ranging from elementary knowledge to advanced professional disciplines such as law, physics, history, and computer science.

The research team used professional human translators to translate MMLU's test set into 14 languages. Relying on human translation increases confidence in the accuracy of the translated questions, especially for low-resource languages such as Yoruba. That accuracy is critical when the dataset is used to evaluate the capabilities of AI models on cross-language tasks.

MMMLU's main uses include multilingual assessment, multi-task ability testing, evaluating cross-cultural understanding, improving model diversity, and supporting research and development. The work behind it covers dataset construction, professional translation, multilingual support, assessment tool development, and performance analysis.
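As a rough illustration of the multilingual assessment and performance analysis mentioned above, the sketch below computes a model's accuracy separately for each language. It is a minimal example under stated assumptions, not the official evaluation code: the record keys (`language`, `answer`, `prediction`) and the toy records are hypothetical and do not reflect the actual layout of the files in MMMLU.zip.

```python
from collections import defaultdict


def accuracy_by_language(records):
    """Return {language: fraction of predictions that match the answer}.

    Each record is a dict with the hypothetical keys 'language',
    'answer' (the correct choice) and 'prediction' (the model's choice).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        lang = rec["language"]
        total[lang] += 1
        if rec["prediction"] == rec["answer"]:
            correct[lang] += 1
    return {lang: correct[lang] / total[lang] for lang in total}


# Toy records for illustration only; these are not real MMMLU items.
records = [
    {"language": "Yoruba", "answer": "A", "prediction": "A"},
    {"language": "Yoruba", "answer": "C", "prediction": "B"},
    {"language": "French", "answer": "D", "prediction": "D"},
    {"language": "French", "answer": "B", "prediction": "B"},
]

scores = accuracy_by_language(records)
for lang, acc in sorted(scores.items()):
    print(f"{lang}: {acc:.2f}")
```

Reporting a score per language rather than a single pooled number makes it easier to see where a model falls behind, for example on a low-resource language.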

MMMLU's application scenarios include language model evaluation, machine translation systems, cross-cultural communication, educational technology, and international business. The release of the dataset is expected to have a significant impact on natural language processing (NLP) research, since it provides a shared resource for both theoretical exploration and the development of practical applications.

MMMLU.torrent
Seeding: 2 | Downloading: 0 | Completed: 63 | Total Downloads: 201
  • MMMLU/
    • README.md (2.19 KB)
    • README.txt (4.38 KB)
    • data/
      • MMMLU.zip (31.05 MB)