HyperAI

EgoThink: A First-person Visual Question Answering Benchmark Dataset

Date

a year ago

Size

865.29 MB

Organization

Tsinghua University

Publish URL

hf-mirror.com

Featured Image

EgoThink is a first-person-perspective visual question answering benchmark dataset proposed by Tsinghua University. The dataset contains 700 images covering 6 core capabilities, broken down into 12 dimensions. EgoThink's images are sampled from the Ego4D first-person video dataset; to ensure data diversity, at most 2 images are sampled from each video.

During dataset construction, only high-quality images that clearly reflect first-person-perspective thinking were selected. The dataset is manually annotated, and each dimension contains at least 50 detailed question-answer pairs drawn from a variety of real-life first-person scenarios. EgoThink has a wide range of applications, particularly in evaluating and improving the performance of vision-language models (VLMs) on first-person tasks, providing a valuable resource for future research on embodied artificial intelligence and robotics.

EgoThink.torrent
Seeding 1 · Downloading 1 · Completed 66 · Total Downloads 48
  • EgoThink/
    • README.md
      1.56 KB
    • README.txt
      3.12 KB
    • data/
      • EgoThink.zip
        865.29 MB
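
After downloading the torrent, the only documented payload is data/EgoThink.zip. Below is a minimal sketch of how the archive might be read for VLM evaluation; the per-dimension folder layout, the annotations.json file name, and the image/question/answer keys are assumptions for illustration, since this page does not document the internal structure of the zip.

```python
import json
import zipfile
from pathlib import Path

# Hypothetical layout: the page only documents data/EgoThink.zip, so the
# per-dimension folders, the "annotations.json" file name, and the
# image / question / answer keys below are assumptions, not a documented schema.
ARCHIVE = Path("data/EgoThink.zip")
ROOT = Path("EgoThink_extracted")

if not ROOT.exists():
    with zipfile.ZipFile(ARCHIVE) as zf:
        zf.extractall(ROOT)

# Walk the extracted tree and collect the manually annotated QA pairs,
# tagging each one with the capability dimension it came from.
qa_pairs = []
for ann_file in ROOT.rglob("annotations.json"):
    dimension = ann_file.parent.name
    with open(ann_file, encoding="utf-8") as f:
        for item in json.load(f):
            qa_pairs.append({
                "dimension": dimension,
                "image": ann_file.parent / "images" / item["image"],
                "question": item["question"],
                "answer": item["answer"],
            })

dimensions = {p["dimension"] for p in qa_pairs}
print(f"Loaded {len(qa_pairs)} first-person QA pairs across {len(dimensions)} dimensions")
```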