HyperAI

VizWiz Visual Question Answering Dataset for the Blind

Date

3 years ago

Size

17.65 GB

Organization

University of Texas at Austin

Publish URL

vizwiz.org

License

CC BY 4.0


VizWiz-VQA (Visual Question Answering) is an image dataset for visual question answering built from the needs of blind people. Each example consists of a photo taken by a blind user with the VizWiz software, a spoken question the user recorded about that photo, and 10 crowdsourced answers to the question. The dataset supports two tasks: predicting the answer to a visual question, and determining whether a visual question can be answered at all. Its aim is to support research on more general algorithms that help blind people overcome obstacles in everyday life.
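The structure described above can be sketched in a few lines of Python. This is only an illustration: the class name, the field names, and the `"unanswerable"` marker string are assumptions made for this example and are not taken from the dataset's own files or API. It shows how a single example (image, question, 10 answers) could yield a target for each of the two tasks.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class VizWizExample:
    """Hypothetical container for one example; field names are assumed."""
    image: str          # image file name
    question: str       # transcribed spoken question
    answers: List[str]  # the 10 crowdsourced answers


def majority_answer(example: VizWizExample) -> str:
    """Task 1 target (sketch): the most frequent normalized answer."""
    counts = Counter(a.strip().lower() for a in example.answers)
    return counts.most_common(1)[0][0]


def is_answerable(example: VizWizExample, marker: str = "unanswerable") -> bool:
    """Task 2 target (sketch): answerable unless the majority answer is the
    assumed marker string."""
    return majority_answer(example) != marker
```

A majority vote is just one simple way to reduce 10 answers to a single label; other aggregation schemes are equally plausible.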

The latest version of the dataset (2020) includes:

  • 20,523 training image/question pairs
  • 205,230 training answers with answer confidence
  • 4,319 validation image/question pairs
  • 43,190 validation answers with answer confidence
  • 8,000 test image/question pairs
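As a quick consistency check on the figures listed above, the answer counts should equal 10 times the number of image/question pairs in each split, since every question comes with 10 crowdsourced answers. The short sketch below verifies this using only the numbers from the list; the variable and function names are chosen for illustration.

```python
ANSWERS_PER_QUESTION = 10  # each question has 10 crowdsourced answers

# (image/question pairs, answers) per split, as listed above.
# No answer count is listed for the test split, so it is recorded as None.
SPLITS = {
    "train": (20_523, 205_230),
    "val": (4_319, 43_190),
    "test": (8_000, None),
}


def answers_match(pairs: int, answers: int) -> bool:
    """True if the answer count equals 10 answers per image/question pair."""
    return pairs * ANSWERS_PER_QUESTION == answers


def total_pairs() -> int:
    """Total number of image/question pairs across all splits."""
    return sum(pairs for pairs, _ in SPLITS.values())


for name, (pairs, answers) in SPLITS.items():
    if answers is not None:
        print(name, "consistent:", answers_match(pairs, answers))
print("total image/question pairs:", total_pairs())
```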
VisWiz.torrent
Seeding: 2 | Downloading: 1 | Completed: 106 | Total downloads: 202
  • VisWiz/
    • README.md
      1.41 KB
    • README.txt
      2.82 KB
    • data/
      • API.zip
        176.98 MB
      • Annotations.zip
        178.55 MB
      • test.zip
        3.88 GB
      • train.zip
        14.4 GB
      • val.zip
        17.65 GB