Date: 7 years ago
Size: 867.36 GB
Publish URL: looking-to-listen.github.io
Tags: Natural Language Processing

AVSpeech is a new, large-scale audio-visual dataset consisting of video clips of speech without interfering background noise. The clips are 3-10 seconds long, and in each clip, the voice heard in the original soundtrack belongs to the only person visible speaking in the video. The dataset contains approximately 4,700 hours of video clips from 290,000 YouTube videos, covering a wide variety of people, languages, and facial poses.
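
As a rough sanity check on the figures quoted above, the short sketch below derives the implied range of clip counts and the average amount of speech per source video. It uses only the approximate numbers in the description (4,700 hours, 290,000 videos, 3-10 second clips), so the results are order-of-magnitude estimates, not official statistics.

```python
# Back-of-the-envelope figures derived only from the numbers quoted in the description.
TOTAL_HOURS = 4_700          # approximate total duration of all clips
NUM_VIDEOS = 290_000         # approximate number of source YouTube videos
MIN_CLIP_S, MAX_CLIP_S = 3, 10  # stated clip length range, in seconds

total_seconds = TOTAL_HOURS * 3600

# Fewest clips if every clip were 10 s long; most clips if every clip were 3 s long.
min_clips = total_seconds // MAX_CLIP_S
max_clips = total_seconds // MIN_CLIP_S

# Average amount of clipped speech contributed by each source video.
avg_seconds_per_video = total_seconds / NUM_VIDEOS

print(f"total speech: {total_seconds:,} s")
print(f"clip count between {min_clips:,} and {max_clips:,}")
print(f"average per source video: {avg_seconds_per_video:.1f} s")
```

Under these assumptions the dataset holds somewhere between about 1.7 million and 5.6 million clips, with just under a minute of clipped speech per source video on average.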

AVSpeech.torrent

Seeding: 3 | Downloading: 1 | Completed: 2,861 | Total Downloads: 4,571

AVSpeech/
- data.z01
  97.91 GB
- data.z02
  195.56 GB
- data.z03
  293.22 GB
- data.z04
  390.88 GB
- data.z05
  488.53 GB
- data.z06
  586.19 GB
- data.z07
  683.84 GB
- data.z08
  781.5 GB
- data.zip
  867.35 GB
- README.md
  1.17 KB
- README.txt
  2.34 KB
- download.sh
  867.35 GB
- avspeech_train.csv
  128.33 MB
- avspeech_train.part0.csv
  153.99 MB
- avspeech_train.part1.csv
  179.66 MB
- avspeech_train.part2.csv
  205.33 MB
- avspeech_train.part3.csv
  230.99 MB
- avspeech_train.part4.csv
  256.66 MB
- parallel-20190822.tar.bz2
  867.35 GB
- parallel-20190822.tar.bz2.sig
  867.35 GB
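
The CSV files in the listing above are index files describing the clips. A minimal sketch of reading one is shown below. It assumes, and this page does not confirm it, that each row has five comma-separated fields with no header: YouTube ID, segment start (seconds), segment end (seconds), and the normalized X and Y coordinates of the speaker's face. The `Clip` type, the `read_clips` helper, and the sample rows are hypothetical and exist only for illustration.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class Clip(NamedTuple):
    """One row of the index file (assumed five-column layout, no header)."""
    youtube_id: str
    start_s: float
    end_s: float
    face_x: float
    face_y: float

    @property
    def duration_s(self) -> float:
        return self.end_s - self.start_s


def read_clips(stream: TextIO) -> Iterator[Clip]:
    """Yield a Clip per well-formed row; rows without exactly 5 fields are skipped."""
    for row in csv.reader(stream):
        if len(row) != 5:
            continue
        vid, start, end, x, y = row
        yield Clip(vid, float(start), float(end), float(x), float(y))


# Made-up rows standing in for the real file contents.
SAMPLE = (
    "hypothetical_id_1,12.0,18.5,0.50,0.40\n"
    "hypothetical_id_2,90.0,93.0,0.25,0.60\n"
)

clips = list(read_clips(io.StringIO(SAMPLE)))

# Flag any clip whose length falls outside the 3-10 second range given in the description.
out_of_range = [c for c in clips if not 3.0 <= c.duration_s <= 10.0]
print(len(clips), "clips,", len(out_of_range), "outside 3-10 s")
```

To run this against the real index, pass an open file instead of the in-memory sample, for example `open("avspeech_train.csv", newline="")`, and check the column layout against the bundled README first.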

This dataset is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at [email protected] for prompt review and removal.
