AVSpeech – Audiovisual Speech Dataset
Date
6 years ago
Size
867.36 GB
AVSpeech is a new, large-scale audio-visual dataset of speech video clips with no interfering background noise. Each clip is 3-10 seconds long, and the voice heard in its soundtrack belongs to the only person who is visible and speaking in the video.
The dataset contains approximately 4,700 hours of video clips from 290,000 YouTube videos, covering a wide variety of people, languages, and facial poses.
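For a rough sense of scale, the figures quoted above can be turned into back-of-envelope estimates. The sketch below is illustrative only: it assumes the 867.36 GB listed for the torrent covers all 4,700 hours, that GB means 10^9 bytes, and that every clip falls within the stated 3-10 second range. The function and constant names are hypothetical, not part of any AVSpeech tooling.

```python
# Figures quoted in this listing (approximate, as stated in the description).
TOTAL_HOURS = 4_700            # total duration of all clips
SOURCE_VIDEOS = 290_000        # YouTube videos the clips were taken from
MIN_CLIP_S, MAX_CLIP_S = 3, 10 # stated clip length range, in seconds
TOTAL_SIZE_GB = 867.36         # torrent size; assumed to cover every clip


def dataset_estimates():
    """Derive rough bounds from the listing's figures (assumptions above)."""
    total_seconds = TOTAL_HOURS * 3600
    return {
        # Fewest clips: every clip is as long as allowed.
        "min_clips": total_seconds // MAX_CLIP_S,
        # Most clips: every clip is as short as allowed.
        "max_clips": total_seconds // MIN_CLIP_S,
        # Average clip footage contributed by each source video.
        "seconds_per_video": total_seconds / SOURCE_VIDEOS,
        # Average storage per hour of footage, in decimal megabytes.
        "mb_per_hour": TOTAL_SIZE_GB * 1000 / TOTAL_HOURS,
    }


if __name__ == "__main__":
    for name, value in dataset_estimates().items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions the dataset holds somewhere between about 1.7 and 5.6 million clips, with just under a minute of usable speech per source video on average. These are bounds implied by the listing, not published statistics.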
AVSpeech.torrent
Seeding: 3 | Downloading: 3 | Completed: 2,120 | Total Downloads: 3,731