Argoverse-HD Streaming Object Detection Dataset
Date
Size
Publish URL
License
Other
Categories

Argoverse-HD is a dataset for streaming object detection, covering real-time object detection, video object detection, tracking, and short-term prediction. It is built on video data from Argoverse 1.1 and contains a total of 70,000 image frames and 1.3 million bounding boxes. The videos have a resolution of 1920 x 1200 at 30 FPS and carry MS COCO-style annotations with track IDs. Because the annotations are backward compatible with COCO, researchers can evaluate COCO pre-trained models on this dataset directly, either to measure a model's efficiency or to test how well it generalizes across datasets.
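Since the annotations follow the COCO layout, they can in principle be read with nothing more than a JSON parser. The snippet below is a minimal sketch of grouping boxes by track ID. The inline sample is hand-made for illustration and is not real data, and the `track` key is an assumed name for the per-box track ID, so check the actual annotation files before relying on it.

```python
import json
from collections import defaultdict

# Hand-made sample in COCO layout. All values are illustrative, not real data.
# The "track" key is an ASSUMED name for the per-box track ID.
SAMPLE = json.loads("""
{
  "images": [
    {"id": 0, "file_name": "frame_000000.jpg", "width": 1920, "height": 1200},
    {"id": 1, "file_name": "frame_000001.jpg", "width": 1920, "height": 1200}
  ],
  "categories": [{"id": 1, "name": "person"}, {"id": 3, "name": "car"}],
  "annotations": [
    {"id": 0, "image_id": 0, "category_id": 3, "bbox": [100, 200, 50, 40], "track": 7},
    {"id": 1, "image_id": 1, "category_id": 3, "bbox": [104, 201, 50, 40], "track": 7},
    {"id": 2, "image_id": 1, "category_id": 1, "bbox": [900, 500, 30, 80], "track": 9}
  ]
}
""")


def boxes_by_track(coco):
    """Group COCO boxes ([x, y, width, height]) by track ID, ordered by image ID."""
    tracks = defaultdict(list)
    for ann in sorted(coco["annotations"], key=lambda a: a["image_id"]):
        tracks[ann["track"]].append((ann["image_id"], ann["bbox"]))
    return dict(tracks)


tracks = boxes_by_track(SAMPLE)
for track_id, boxes in tracks.items():
    print(track_id, boxes)
```

Keeping the standard `images`, `categories`, and `annotations` keys untouched is what lets existing COCO tooling consume the same file. The track ID is simply an extra field on each annotation.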
Argoverse-HD is the dataset used in the Streaming Perception Challenge, which has two tracks:
- Single Detection (Real-time Object Detection): Participants develop single-frame object detectors, as in the COCO and LVIS challenges. The key difference is that the evaluation accounts for latency by scoring submissions on streaming accuracy.
- Full Stack: Methods are unrestricted in this track. In practice, most entries are expected to use tracking and prediction to compensate for the detector's latency.
By default, the latency of every submission is measured with the official toolkit on a V100 GPU.
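To illustrate why latency matters in both tracks, the sketch below shows one way a latency-aware evaluation can pair frames with predictions: each frame is compared against the most recent prediction that had finished by that frame's timestamp. This is a simplified illustration, not the official toolkit. The function name `match_streaming` and the finish times are hypothetical, and only the 30 FPS frame rate comes from the dataset description.

```python
from bisect import bisect_right

FPS = 30  # frame rate stated for the Argoverse-HD videos


def match_streaming(finish_times, num_frames, fps=FPS):
    """Pair each frame with the latest prediction available at its timestamp.

    finish_times: sorted list of times (seconds) at which predictions finished.
    Returns, per frame, the index of the matched prediction, or None if no
    prediction was ready yet.
    """
    matched = []
    for i in range(num_frames):
        t = i / fps  # timestamp of frame i
        k = bisect_right(finish_times, t)  # predictions finished by time t
        matched.append(k - 1 if k > 0 else None)
    return matched


# Hypothetical detector: predictions for frames 0, 2 and 4, each finished
# 50 ms after its input frame arrived (made-up numbers for illustration).
finish_times = [0 / FPS + 0.05, 2 / FPS + 0.05, 4 / FPS + 0.05]
result = match_streaming(finish_times, num_frames=6)
print(result)  # [None, None, 0, 0, 1, 1]
```

In this example every frame is scored against a prediction computed from an older frame, so accuracy drops as objects move. That gap is what tracking and prediction in the Full Stack track are meant to close.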