COREVQA Visual Question Answering Benchmark Dataset

Date: a month ago

Size: 5.63 GB

Publish URL: www.kaggle.com

Paper URL: 2507.13405

License: Apache 2.0

COREVQA is a visual question answering benchmark dataset released by the Algoverse Artificial Intelligence Research Center in 2025. It accompanies the paper "COREVQA: A Crowd Observation and Reasoning Entailment Visual Question Answering Benchmark" and is designed to evaluate how well vision-language models (VLMs) perform entailment reasoning in crowd scenes.

The dataset contains 5,608 pairs, each consisting of an image and a statement labeled true or false. The images are drawn from the CrowdHuman dataset and primarily depict real-world crowded scenes, emphasizing challenges such as occlusion, perspective changes, and background interference. The benchmark is intended to advance the fine-grained perception and reasoning capabilities of VLMs in complex social scenarios.

The data includes:

  • Scene image (image_id)
  • Natural language statement (question)
  • Binary label (answer: TRUE / FALSE)
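
The three fields above map naturally onto a small record type plus an accuracy metric. The following is a minimal sketch, not the benchmark's official loader or scoring script. It assumes the annotations are available as a CSV with the column names image_id, question, and answer; the actual file layout inside the archive is not described here, and the sample rows and helper names (Example, parse_label, load_examples, accuracy) are hypothetical.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    image_id: str   # identifier of the scene image
    question: str   # natural language statement about the image
    answer: bool    # True if the statement holds for the image


def parse_label(raw: str) -> bool:
    """Convert a TRUE / FALSE string into a bool, rejecting anything else."""
    value = raw.strip().upper()
    if value not in {"TRUE", "FALSE"}:
        raise ValueError(f"unexpected label: {raw!r}")
    return value == "TRUE"


def load_examples(handle) -> list[Example]:
    """Read records from a CSV file object (assumed column names)."""
    reader = csv.DictReader(handle)
    return [
        Example(row["image_id"], row["question"], parse_label(row["answer"]))
        for row in reader
    ]


def accuracy(examples: list[Example], predictions: list[bool]) -> float:
    """Fraction of examples whose predicted label matches the gold label."""
    if len(examples) != len(predictions):
        raise ValueError("examples and predictions must have the same length")
    if not examples:
        raise ValueError("cannot score an empty set of examples")
    correct = sum(ex.answer == pred for ex, pred in zip(examples, predictions))
    return correct / len(examples)


# Hypothetical rows for illustration only; these are not taken from the dataset.
SAMPLE_CSV = """image_id,question,answer
img_0001,At least three people are wearing hats.,TRUE
img_0002,Nobody in the foreground is holding a bag.,FALSE
"""

examples = load_examples(io.StringIO(SAMPLE_CSV))
score = accuracy(examples, [True, True])  # a model that always answers TRUE
print(f"accuracy: {score:.2f}")
```

Because the label is binary, an always-TRUE baseline like the one in the sketch is a useful sanity check: a model's accuracy is only meaningful relative to the share of TRUE labels in the data.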

COREVQA.torrent
  • COREVQA/
    • README.md (1.42 KB)
    • README.txt (2.85 KB)
    • data/
      • COREVQA.zip (5.63 GB)