Quasi-Dense Similarity Learning for Multiple Object Tracking

Similarity learning has been recognized as a crucial step for object tracking. However, existing multiple object tracking methods only use sparse ground truth matching as the training objective, while ignoring the majority of the informative regions on the images. In this paper, we present Quasi-Dense Similarity Learning, which densely samples hundreds of region proposals on a pair of images for contrastive learning. We can directly combine this similarity learning with existing detection methods to build Quasi-Dense Tracking (QDTrack) without turning to displacement regression or motion priors. We also find that the resulting distinctive feature space admits a simple nearest neighbor search at inference time. Despite its simplicity, QDTrack outperforms all existing methods on the MOT, BDD100K, Waymo, and TAO tracking benchmarks. It achieves 68.7 MOTA at 20.3 FPS on MOT17 without using external training data. Compared to methods with similar detectors, it improves MOTA by almost 10 points and significantly decreases the number of ID switches on the BDD100K and Waymo datasets. Our code and trained models are available at http://vis.xyz/pub/qdtrack.
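
To make the inference-time nearest neighbor search concrete, the following is a minimal sketch and not the released QDTrack implementation. It assumes each existing track and each new detection already has an embedding vector, and it matches them by cosine similarity. The function name `nearest_neighbor_match`, the threshold `min_sim`, and the greedy one-to-one tie-breaking rule are illustrative assumptions that do not come from the abstract.

```python
import numpy as np

def nearest_neighbor_match(track_embs, det_embs, min_sim=0.5):
    """Assign each detection to its most similar track, or -1 for no match.

    Hypothetical helper for illustration; `min_sim` is an assumed threshold.
    """
    assignment = np.full(len(det_embs), -1, dtype=int)
    if len(track_embs) == 0 or len(det_embs) == 0:
        return assignment

    # L2-normalize so that the dot product equals cosine similarity.
    t = track_embs / np.linalg.norm(track_embs, axis=1, keepdims=True)
    d = det_embs / np.linalg.norm(det_embs, axis=1, keepdims=True)
    sim = d @ t.T  # shape: (num_detections, num_tracks)

    best_track = sim.argmax(axis=1)  # nearest track for each detection
    best_sim = sim.max(axis=1)       # similarity to that nearest track

    used_tracks = set()
    # Visit detections from strongest to weakest best match so that each
    # track is claimed by at most one detection (assumed tie-breaking rule).
    for det_idx in np.argsort(-best_sim):
        trk_idx = int(best_track[det_idx])
        if best_sim[det_idx] >= min_sim and trk_idx not in used_tracks:
            assignment[det_idx] = trk_idx
            used_tracks.add(trk_idx)
    return assignment

# Toy example: two tracks, three detections (the last resembles neither track).
tracks = np.array([[1.0, 0.0], [0.0, 1.0]])
dets = np.array([[0.9, 0.1], [0.1, 0.9], [-1.0, -1.0]])
print(nearest_neighbor_match(tracks, dets).tolist())  # [0, 1, -1]
```

A detection left at -1 would typically start a new track. The sketch needs no motion model or displacement regression, which is the property the abstract highlights. The more discriminative the learned embeddings, the less the result depends on the exact threshold.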