
ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning

Hu, Shengchao; Chen, Li; Wu, Penghao; Li, Hongyang; Yan, Junchi; Tao, Dacheng
Abstract

Many existing autonomous driving paradigms involve a multi-stage discrete pipeline of tasks. To better predict the control signals and enhance user safety, an end-to-end approach that benefits from joint spatial-temporal feature learning is desirable. While there are some pioneering works on LiDAR-based input or implicit design, in this paper we formulate the problem in an interpretable vision-based setting. In particular, we propose a spatial-temporal feature learning scheme towards a set of more representative features for perception, prediction and planning tasks simultaneously, which is called ST-P3. Specifically, an egocentric-aligned accumulation technique is proposed to preserve geometry information in 3D space before the bird's eye view transformation for perception; a dual pathway modeling is devised to take past motion variations into account for future prediction; a temporal-based refinement unit is introduced to compensate for recognizing vision-based elements for planning. To the best of our knowledge, we are the first to systematically investigate each part of an interpretable end-to-end vision-based autonomous driving system. We benchmark our approach against previous state-of-the-arts on both the open-loop nuScenes dataset and the closed-loop CARLA simulation. The results show the effectiveness of our method. Source code, model and protocol details are made publicly available at https://github.com/OpenPerceptionX/ST-P3.
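To make the egocentric-aligned accumulation idea concrete, below is a minimal PyTorch sketch, not the released ST-P3 code: the function names (`warp_bev`, `accumulate`), the SE(2) transform convention, and the BEV resolution are illustrative assumptions. The gist is that past bird's eye view feature maps are warped into the current ego frame using known ego-motion before being fused, so the accumulated features stay geometrically consistent over time.

```python
# Hypothetical sketch of egocentric-aligned accumulation (assumed API,
# not the official ST-P3 implementation).
import torch
import torch.nn.functional as F

def warp_bev(feat, ego_to_current, bev_resolution):
    """Warp a past BEV feature map (B, C, H, W) into the current ego frame.

    ego_to_current: (B, 3, 3) SE(2) transform (rotation + translation in
    metres on the BEV plane) from the past ego frame to the current one.
    bev_resolution: metres per BEV cell, used to normalise the translation.
    """
    B, _, H, W = feat.shape
    # grid_sample needs the inverse mapping (output -> input cells),
    # so invert the forward ego-motion transform first.
    inv = torch.inverse(ego_to_current)
    theta = inv[:, :2, :].clone()
    # Convert metric translation into normalised grid units in [-1, 1].
    theta[:, 0, 2] = inv[:, 0, 2] / (0.5 * W * bev_resolution)
    theta[:, 1, 2] = inv[:, 1, 2] / (0.5 * H * bev_resolution)
    grid = F.affine_grid(theta, feat.shape, align_corners=False)
    return F.grid_sample(feat, grid, align_corners=False)

def accumulate(past_feats, transforms, bev_resolution=0.5):
    """Average past BEV features after aligning each to the current frame."""
    aligned = [warp_bev(f, T, bev_resolution)
               for f, T in zip(past_feats, transforms)]
    return torch.stack(aligned, dim=0).mean(dim=0)
```

A simple mean fusion is used here for brevity; a learned temporal model (e.g. a recurrent unit over the aligned features) would be a natural replacement in a full system.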
