
XQuAD Cross-Lingual Question Answering Dataset

XQuAD (Cross-Lingual Question Answering Dataset) is a benchmark dataset for evaluating cross-lingual question answering performance. The dataset consists of a subset of 240 passages and 1,190 question-answer pairs from the development set of SQuAD v1.1 (Rajpurkar et al., 2016), together with their professional translations into ten languages: Spanish, German, Greek, Russian, Turkish, Arabic, Vietnamese, Thai, Chinese, and Hindi. The dataset is therefore fully parallel across 11 languages (English plus the ten translations).

For details on how the dataset was created, please refer to the paper "On the Cross-lingual Transferability of Monolingual Representations".
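Because XQuAD is derived from SQuAD v1.1, a reasonable assumption is that each language file follows the SQuAD v1.1 JSON layout (`data` → `paragraphs` → `qas` → `answers`). The sketch below flattens such a file into one record per question; the `QAExample` type and the helper names are illustrative and not part of the dataset.

```python
import json
from typing import Iterator, List, NamedTuple


class QAExample(NamedTuple):
    """One question with its passage and first reference answer (hypothetical type)."""
    qid: str
    question: str
    context: str
    answer_text: str
    answer_start: int


def iter_examples(squad: dict) -> Iterator[QAExample]:
    """Yield one QAExample per question, assuming a SQuAD v1.1-style layout."""
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                answer = qa["answers"][0]  # use the first reference answer
                yield QAExample(
                    qa["id"],
                    qa["question"],
                    context,
                    answer["text"],
                    answer["answer_start"],
                )


def count_paragraphs(squad: dict) -> int:
    """Count passages across all articles."""
    return sum(len(article["paragraphs"]) for article in squad["data"])


def load_examples(path: str) -> List[QAExample]:
    """Read one language file from disk and flatten it."""
    with open(path, encoding="utf-8") as f:
        return list(iter_examples(json.load(f)))
```

If the figures quoted above hold and the layout assumption is right, `load_examples("XQuAD/data/xquad.en.json")` should return 1,190 records, and `count_paragraphs` on the same file should report 240 passages.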

XQuAD.torrent
Seeding: 2 | Downloading: 0 | Completed: 121 | Total downloads: 287
  • XQuAD/
    • README.md
      1.29 KB
    • README.txt
      2.58 KB
    • data/
      • CC-BY-SA4.0.txt
        17.28 KB
      • README.md
        24.8 KB
      • xquad.ar.json
        1.53 MB
      • xquad.de.json
        2.17 MB
      • xquad.el.json
        4 MB
      • xquad.en.json
        4.58 MB
      • xquad.es.json
        5.24 MB
      • xquad.hi.json
        6.84 MB
      • xquad.ro.json
        7.47 MB
      • xquad.ru.json
        9.28 MB
      • xquad.th.json
        11 MB
      • xquad.tr.json
        11.7 MB
      • xquad.vi.json
        12.57 MB
      • xquad.zh.json
        13.34 MB
      • xquad_example.png
        14 MB
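
Since the dataset is parallel across languages, the per-language files listed above can be lined up question by question. The following is a minimal sketch, assuming that each `xquad.<lang>.json` file uses the SQuAD v1.1 JSON layout and that question `id` values are shared across languages; neither assumption is confirmed on this page, and the function names are illustrative.

```python
import json
from pathlib import Path
from typing import Dict, Iterable


def questions_by_id(squad: dict) -> Dict[str, str]:
    """Map question id -> question text for one parsed language file."""
    return {
        qa["id"]: qa["question"]
        for article in squad["data"]
        for paragraph in article["paragraphs"]
        for qa in paragraph["qas"]
    }


def align_questions(per_language: Dict[str, dict]) -> Dict[str, Dict[str, str]]:
    """Return id -> {lang: question} for ids present in every language.

    `per_language` maps a language code (e.g. "en") to its parsed JSON.
    """
    tables = {lang: questions_by_id(data) for lang, data in per_language.items()}
    if not tables:
        return {}
    shared = set.intersection(*(set(table) for table in tables.values()))
    return {
        qid: {lang: table[qid] for lang, table in tables.items()}
        for qid in sorted(shared)
    }


def load_languages(data_dir: str, langs: Iterable[str]) -> Dict[str, dict]:
    """Read xquad.<lang>.json for each requested language code."""
    loaded = {}
    for lang in langs:
        path = Path(data_dir) / f"xquad.{lang}.json"
        with open(path, encoding="utf-8") as f:
            loaded[lang] = json.load(f)
    return loaded
```

For example, `align_questions(load_languages("XQuAD/data", ["en", "es", "de"]))` would give, for each shared question id, the English, Spanish, and German wording side by side. Taking the intersection of ids rather than assuming identical id sets keeps the sketch safe if a file turns out to differ.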