HyperAI

DiDeMo Temporal Localization Dataset

Date

3 years ago

Size

4.39 GB

Publish URL

github.com

License

Other

DiDeMo (Distinct Describable Moments) is a dataset for temporally localizing events in a video given a natural language description. The videos are collected from Flickr, and each is trimmed to a maximum of 30 seconds. To reduce the complexity of annotation, each video is divided into 5-second segments.
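
The segmentation described above can be illustrated with a short sketch. It assumes that a moment is any contiguous run of 5-second segments within a clip of at most 30 seconds; that assumption, and the names `candidate_moments`, `SEGMENT_SECONDS`, and `MAX_VIDEO_SECONDS`, are illustrative and are not taken from the dataset's own tooling.

```python
from itertools import combinations_with_replacement

SEGMENT_SECONDS = 5     # segment length stated in the description
MAX_VIDEO_SECONDS = 30  # maximum clip length stated in the description


def candidate_moments(video_seconds=MAX_VIDEO_SECONDS,
                      segment_seconds=SEGMENT_SECONDS):
    """Return every contiguous run of segments as (start_sec, end_sec).

    Assumption: a moment spans one or more adjacent segments.
    """
    # Ceiling division, so a trailing partial segment still counts.
    num_segments = -(-video_seconds // segment_seconds)
    moments = []
    # Each pair (first, last) with first <= last is one contiguous span.
    for first, last in combinations_with_replacement(range(num_segments), 2):
        start = first * segment_seconds
        end = min((last + 1) * segment_seconds, video_seconds)
        moments.append((start, end))
    return moments


if __name__ == "__main__":
    moments = candidate_moments()
    # 6 segments give 6 * 7 / 2 = 21 contiguous spans.
    print(len(moments))
```

Under this assumption, a full 30-second clip has only 21 candidate moments, which suggests why coarse segments keep the annotation task manageable.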

The dataset is split into training, validation, and test sets containing 8,395, 1,065, and 1,004 videos, respectively. It contains 26,892 moments in total, and a single moment may have descriptions from multiple annotators. The descriptions are detailed and cover camera movement, temporal transition indicators, and activities. They are also verified so that each description refers to a single moment.

DiDeMo.torrent
Seeding: 1 | Downloading: 0 | Completed: 575 | Total downloads: 989
  • DiDeMo/
    • README.md
      1.43 KB
    • README.txt
      2.86 KB
    • data/
      • average_flow_feats.h5
        652.28 MB
      • average_rgb_feats.h5
        2.59 GB
      • data_didemo.zip
        4.3 GB
      • models_didemo.zip
        4.39 GB