HyperAI

VATEX Video Captioning Dataset

Date

3 years ago

Size

4.31 GB

Organization

University of California, Santa Barbara

License

CC BY 4.0

VATEX, short for Video And TEXt, is a large-scale multilingual video description dataset containing 41,250 videos and 825,000 captions in English and Chinese. Among these captions, more than 206,000 are English-Chinese parallel translation pairs.

This dataset is mainly used for:

- Multilingual video caption generation

- Video caption translation

VATEX.torrent
Seeding: 1 | Downloading: 1 | Completed: 554 | Total Downloads: 1,090
  • VATEX/
    • README.md
      1.11 KB
    • README.txt
      2.22 KB
    • data/
      • private_test.zip
        665.06 MB
      • public_test.zip
        1.27 GB
      • trainval.zip
        4.24 GB
      • vatex_private_test_without_annotations.json
        4.24 GB
      • vatex_public_test_english_v1.1.json
        4.25 GB
      • vatex_training_v1.0.json
        4.3 GB
      • vatex_validation_v1.0.json
        4.31 GB
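
As a rough illustration of how the annotation files listed above might be used for caption translation, the following Python sketch reads one JSON file and extracts English-Chinese caption pairs. It rests on assumptions not stated on this page: that each file is a JSON array of per-video records, that the records use the field names `videoID`, `enCap`, and `chCap`, and that the last five captions in each language are the aligned translations. Check the bundled README before relying on any of these. The function names `load_annotations` and `translation_pairs` are hypothetical helpers introduced for this example.

```python
import json
from pathlib import Path


def load_annotations(path):
    """Read one annotation file (assumed to be a JSON array of per-video records)."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def translation_pairs(records, en_key="enCap", zh_key="chCap", n_parallel=5):
    """Yield (video_id, english, chinese) tuples.

    Assumption: the last `n_parallel` captions in each language are aligned
    translations of each other. Field names are assumptions as well.
    """
    if n_parallel <= 0:
        return
    for rec in records:
        # Records without one of the caption lists (for example, an
        # English-only test file) simply produce no pairs.
        en = rec.get(en_key, [])[-n_parallel:]
        zh = rec.get(zh_key, [])[-n_parallel:]
        for e, z in zip(en, zh):
            yield rec.get("videoID"), e, z


if __name__ == "__main__":
    # The path follows the file tree above; adjust it to wherever the data was extracted.
    records = load_annotations("VATEX/data/vatex_validation_v1.0.json")
    pairs = list(translation_pairs(records))
    print(f"{len(records)} videos, {len(pairs)} translation pairs")
```

Passing the key names and the number of parallel captions as parameters keeps the sketch usable if the actual schema turns out to differ from these assumptions.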