P-MMEval multi-language multi-task Benchmark Dataset

Date: a year ago

Size: 12.72 MB

Organization

Paper URL: arxiv.org


P-MMEval is a large-scale multilingual, multi-task benchmark dataset created by Alibaba Group's Tongyi Lab in 2024 to comprehensively evaluate the multilingual capabilities of large language models (LLMs). The associated paper is "P-MMEVAL: A Parallel Multilingual Multitask Benchmark for Consistent Evaluation of LLMs".

The dataset contains 3 basic natural language processing (NLP) datasets and 5 advanced capability-specific datasets, covering tasks such as code generation, knowledge understanding, mathematical reasoning, logical reasoning, and instruction following. Through expert translation review, P-MMEval ensures consistent coverage of 10 languages and provides parallel samples across languages. These languages include English, Chinese, Arabic, Spanish, Japanese, Korean, Thai, French, Portuguese, and Vietnamese.
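"Parallel" here means that every language carries the same set of samples. The following is a minimal sketch of how one might check that property after loading the data, assuming a hypothetical in-memory layout in which each language code maps to a list of sample IDs. The ISO 639-1 codes and the `find_coverage_gaps` helper are illustrative assumptions, not part of the dataset's files or tooling.

```python
# ISO 639-1 codes for the ten languages listed above. The codes are an
# assumption; the dataset's own files may label languages differently.
LANGUAGES = ["en", "zh", "ar", "es", "ja", "ko", "th", "fr", "pt", "vi"]


def find_coverage_gaps(samples_by_language):
    """Return {language: sorted missing sample IDs} for non-parallel languages.

    `samples_by_language` is a hypothetical mapping of language code to an
    iterable of sample IDs. An empty result means every expected language
    has every sample that appears in any language.
    """
    # Collect the union of sample IDs seen across all languages.
    all_ids = set()
    for ids in samples_by_language.values():
        all_ids.update(ids)

    # Report, per expected language, which of those IDs are absent.
    gaps = {}
    for lang in LANGUAGES:
        missing = all_ids - set(samples_by_language.get(lang, ()))
        if missing:
            gaps[lang] = sorted(missing)
    return gaps
```

An empty dictionary from `find_coverage_gaps` would indicate that the loaded samples are fully parallel across the ten languages under this assumed layout.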

P-MMEval.torrent
Seeding: 1 | Downloading: 0 | Completed: 118 | Total Downloads: 161
  • P-MMEval/
    • README.md (1.48 KB)
    • README.txt (2.97 KB)
    • data/
      • P-MMEval.zip (12.72 MB)
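
The listing does not describe what is inside P-MMEval.zip. As a rough sketch, the archive can be inspected without extracting it by using Python's standard `zipfile` module. The `summarize_archive` helper below is a hypothetical name introduced for illustration, and it makes no assumption about the file formats the archive actually contains.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_archive(zip_path):
    """Count the files in a zip archive by extension, without extracting it."""
    counts = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # Skip directory entries; only real files are counted.
            if info.is_dir():
                continue
            # Zip member names always use forward slashes.
            suffix = PurePosixPath(info.filename).suffix or "(none)"
            counts[suffix] += 1
    return dict(counts)
```

Assuming the torrent was downloaded into the current directory, a call such as `summarize_archive("P-MMEval/data/P-MMEval.zip")` would return a mapping from file extension to file count.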
