MMPTRACK: Large-scale Densely Annotated Multi-camera Multiple People Tracking Benchmark

Multi-camera tracking systems are gaining popularity in applications that demand high-quality tracking results, such as frictionless checkout, because monocular multi-object tracking (MOT) systems often fail in cluttered and crowded environments due to occlusion. Multiple highly overlapped cameras can significantly alleviate the problem by recovering partial 3D information. However, the cost of creating a high-quality multi-camera tracking dataset with diverse camera settings and backgrounds has limited the dataset scale in this domain. In this paper, we provide a large-scale, densely labeled multi-camera tracking dataset in five different environments with the help of an auto-annotation system. The system uses overlapped and calibrated depth and RGB cameras to build a high-performance 3D tracker that automatically generates the 3D tracking results. The 3D tracking results are projected to each RGB camera view using camera parameters to create 2D tracking results. Then, we manually check and correct the 3D tracking results to ensure the label quality, which is much cheaper than fully manual annotation. We have conducted extensive experiments using two real-time multi-camera trackers and a person re-identification (ReID) model with different settings. This dataset provides a more reliable benchmark of multi-camera, multi-object tracking systems in cluttered and crowded environments. Also, our results demonstrate that adapting the trackers and ReID models on this dataset significantly improves their performance. Our dataset will be publicly released upon the acceptance of this work.
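
To illustrate the step in which 3D tracking results are projected to each RGB camera view, the following is a minimal sketch and not the annotation system's actual code. It assumes a distortion-free pinhole model with a world-to-camera rotation and translation and a 3x3 intrinsic matrix per camera. The names `project_point` and `project_to_cameras`, the dictionary layout, and the example numbers are all hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple

Vec3 = Sequence[float]
Mat3 = Sequence[Sequence[float]]
Pixel = Tuple[float, float]


def project_point(
    point_world: Vec3,
    rotation: Mat3,     # assumed world-to-camera rotation R (3x3)
    translation: Vec3,  # assumed world-to-camera translation t
    intrinsics: Mat3,   # assumed intrinsic matrix K (3x3), no lens distortion
) -> Optional[Pixel]:
    """Project one 3D world point to pixel coordinates with a pinhole model.

    Returns None when the point is not in front of the camera.
    """
    # Camera-frame coordinates: X_c = R * X_w + t
    cam = [
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    if cam[2] <= 0:
        return None
    # Homogeneous pixel coordinates: x = K * X_c, then divide by depth.
    hom = [sum(intrinsics[i][j] * cam[j] for j in range(3)) for i in range(3)]
    return (hom[0] / hom[2], hom[1] / hom[2])


def project_to_cameras(
    point_world: Vec3,
    cameras: Dict[str, Dict[str, Sequence]],
) -> Dict[str, Optional[Pixel]]:
    """Project one 3D track position into every camera view.

    `cameras` maps a camera id to a dict with keys "R", "t", and "K"
    (a hypothetical layout chosen for this sketch).
    """
    return {
        cam_id: project_point(point_world, params["R"], params["t"], params["K"])
        for cam_id, params in cameras.items()
    }


# Hypothetical single-camera setup: identity pose, 1000 px focal length,
# principal point at (640, 360).
example_cameras = {
    "cam0": {
        "R": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
        "t": [0.0, 0.0, 0.0],
        "K": [[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]],
    }
}
example_pixels = project_to_cameras([0.5, -0.25, 5.0], example_cameras)
print(example_pixels)  # {'cam0': (740.0, 310.0)}
```

In practice, a 2D label would usually be derived from several projected points per person (for example, the corners of a 3D bounding volume), and real cameras would also need a distortion model. Both are omitted here for brevity.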