FIERY: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras

Driving requires interacting with road agents and predicting their future behaviour in order to navigate safely. We present FIERY: a probabilistic future prediction model in bird's-eye view from monocular cameras. Our model predicts future instance segmentation and motion of dynamic agents that can be transformed into non-parametric future trajectories. Our approach combines the perception, sensor fusion and prediction components of a traditional autonomous driving stack by estimating bird's-eye-view prediction directly from surround RGB monocular camera inputs. FIERY learns to model the inherent stochastic nature of the future solely from camera driving data in an end-to-end manner, without relying on HD maps, and predicts multimodal future trajectories. We show that our model outperforms previous prediction baselines on the NuScenes and Lyft datasets. The code and trained models are available at https://github.com/wayveai/fiery.