HyperAI

MMVP Multimodal Motion Capture Dataset

Date

a year ago

Size

3 MB

Organization

Beijing University of Aeronautics and Astronautics
Tsinghua University

Publish URL

hf-mirror.com

MMVP (Multimodal MoCap Dataset with Vision and Pressure Sensors) is a multimodal motion capture dataset that combines vision and pressure sensor data. It was jointly developed by Beihang University, Tsinghua University, and Nanjing University.

The dataset covers a wide range of rapid human movements, such as running, skipping, and the standing long jump. In total, it contains more than 44k synchronized frames of RGBD and pressure data collected from 16 subjects. The researchers recorded RGBD video with an Azure Kinect camera at 30 frames per second and captured plantar pressure data with Xsensor pressure insoles at up to 150 frames per second. The two data streams were synchronized manually and then processed and analyzed with deep learning methods such as FPP-Net and CLIFF. The dataset provides a new resource for research on human motion capture based on vision and pressure sensors.
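Because the two streams run at different rates (30 frames per second for RGBD, up to 150 for pressure), pairing them means picking one pressure sample per video frame. The following is a minimal sketch of nearest-timestamp matching, not the authors' procedure: the source says only that synchronization was done manually. The function name `align_pressure_to_rgbd`, the `offset` parameter (standing in for a manually determined time shift), and the synthetic timestamps are all assumptions made for illustration.

```python
from bisect import bisect_left


def align_pressure_to_rgbd(rgbd_times, pressure_times, offset=0.0):
    """Return, for each RGBD timestamp, the index of the nearest pressure sample.

    pressure_times must be sorted in ascending order. offset (in seconds) is a
    hypothetical, manually chosen shift added to every pressure timestamp.
    """
    shifted = [t + offset for t in pressure_times]
    matches = []
    for t in rgbd_times:
        i = bisect_left(shifted, t)  # first pressure sample at or after t
        if i == 0:
            matches.append(0)
        elif i == len(shifted):
            matches.append(len(shifted) - 1)
        else:
            before, after = shifted[i - 1], shifted[i]
            # Choose the closer neighbour; ties go to the earlier sample.
            matches.append(i if after - t < t - before else i - 1)
    return matches


# Synthetic timestamps in seconds (assumed rates: 30 Hz video, 150 Hz pressure).
rgbd_times = [k / 30 for k in range(4)]
pressure_times = [k / 150 for k in range(20)]
matches = align_pressure_to_rgbd(rgbd_times, pressure_times)
print(matches)
```

With a zero offset and these idealized rates, each video frame lines up with every fifth pressure sample. Real recordings would need the offset estimated from a shared event, such as a foot strike visible in both streams.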

Description: The MMVP (Multimodal Visual Patterns) benchmark, which shares its acronym with the motion capture dataset described above, focuses on identifying "CLIP-blind pairs": images that CLIP considers similar despite obvious visual differences. MMVP evaluates state-of-the-art systems, including GPT-4V, on nine basic visual patterns. It highlights how these systems struggle to answer simple questions, often giving incorrect responses and hallucinated explanations.

  • Content type: Images (CLIP-blind pairs)
  • Quantity: 300 images
  • Data source: Derived from ImageNet-1k and LAION-Aesthetics
  • Data collection method: Identification of CLIP-blind pairs by comparative analysis
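The idea of a CLIP-blind pair can be sketched as a comparison between two embedding spaces: a pair qualifies when CLIP scores the two images as highly similar while a second encoder scores them as clearly different. The sketch below is illustrative only. The source says just "comparative analysis", so the use of a reference encoder, the function name `find_clip_blind_pairs`, the threshold values, and the toy embeddings are all assumptions.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def find_clip_blind_pairs(clip_emb, ref_emb, clip_min=0.95, ref_max=0.6):
    """Return image-ID pairs that CLIP rates as similar but the reference does not.

    clip_emb and ref_emb map image IDs to embedding vectors. The thresholds
    are illustrative assumptions, not values taken from the dataset page.
    """
    pairs = []
    for a, b in combinations(sorted(clip_emb), 2):
        similar_in_clip = cosine(clip_emb[a], clip_emb[b]) > clip_min
        different_in_ref = cosine(ref_emb[a], ref_emb[b]) < ref_max
        if similar_in_clip and different_in_ref:
            pairs.append((a, b))
    return pairs


# Toy 2-D embeddings (hypothetical): img_a and img_b look alike to CLIP only.
clip_emb = {"img_a": [1.0, 0.0], "img_b": [0.99, 0.05], "img_c": [0.0, 1.0]}
ref_emb = {"img_a": [1.0, 0.0], "img_b": [0.0, 1.0], "img_c": [0.0, 1.0]}
blind_pairs = find_clip_blind_pairs(clip_emb, ref_emb)
print(blind_pairs)
```

In practice the embeddings would come from actual image encoders, and the all-pairs loop would be replaced by an approximate nearest-neighbour search for collections the size of ImageNet-1k or LAION-Aesthetics.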
MMVP.torrent
Seeding: 1 | Downloading: 1 | Completed: 122 | Total downloads: 101
  • MMVP/
    • README.md (2.15 KB)
    • README.txt (4.29 KB)
    • data/
      • MMVP.zip (3 MB)