WMT 2015 French/English Parallel Texts Dataset

Date

6 years ago

Size

2.42 GB

Organization

Johns Hopkins University
University of Amsterdam
University of Edinburgh

Publish URL

s3.amazonaws.com

WMT 2015 French/English Parallel Texts is a parallel corpus used to train French-English translation models. It contains more than 20 million French and English sentences.

The dataset was created by Chris Callison-Burch, who crawled millions of web pages and used a simple set of heuristics to map French URLs to English URLs, on the assumption that the paired documents are translations of each other.
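The URL-mapping idea can be illustrated with a minimal sketch. The page does not list the actual heuristics, so the substitution rules, the function names `candidate_english_urls` and `pair_urls`, and the example URLs below are all hypothetical and serve only to show the general shape of such an approach.

```python
import re

# Hypothetical rewrite rules (assumptions, not the dataset author's real heuristics).
# Each rule turns a French-looking URL fragment into its English counterpart.
RULES = [
    (re.compile(r"/fr/"), "/en/"),            # language directory in the path
    (re.compile(r"_fr\."), "_en."),           # language suffix before the extension
    (re.compile(r"([?&]lang=)fr\b"), r"\1en"),  # language query parameter
]


def candidate_english_urls(french_url):
    """Return the English URL candidates produced by every rule that matches."""
    candidates = []
    for pattern, replacement in RULES:
        rewritten, count = pattern.subn(replacement, french_url)
        if count and rewritten not in candidates:
            candidates.append(rewritten)
    return candidates


def pair_urls(crawled_urls):
    """Pair each French URL with the first candidate English URL that was also crawled."""
    crawled = set(crawled_urls)
    pairs = []
    for url in crawled_urls:
        for candidate in candidate_english_urls(url):
            if candidate in crawled and candidate != url:
                pairs.append((url, candidate))
                break
    return pairs


if __name__ == "__main__":
    urls = [
        "http://example.org/fr/about.html",
        "http://example.org/en/about.html",
        "http://example.org/fr/only.html",  # no English counterpart, so it is dropped
    ]
    for french, english in pair_urls(urls):
        print(french, "<->", english)
```

Pairs found this way are only assumed to be translations of each other. A real pipeline would still need to align the paired documents sentence by sentence and filter out mismatches.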

The dataset was jointly released in 2009 by Johns Hopkins University, the University of Edinburgh, and the University of Amsterdam.

WMT 2015 French-English parallel texts.torrent
  • WMT 2015 French-English parallel texts/
    • README.md
      1.15 KB
    • README.txt
      2.31 KB
    • data/
      • WMT 2015 FrenchEnglish parallel texts.zip
        2.42 GB