Cross-view image synthesis using geometry-guided conditional GANs

We address the problem of generating images across two drastically different views, namely ground (street) and aerial (overhead) views. Image synthesis by itself is a very challenging computer vision task, and it is even more so when generation is conditioned on an image in another view. Due to the difference in viewpoints, the two views share only a small overlapping field of view and little common content. Here, we try to preserve the pixel information between the views so that the generated image is a realistic representation of the cross-view input image. To this end, we propose to use homography as a guide to map images between the views based on the common field of view, preserving the details of the input image. We then use generative adversarial networks to inpaint the missing regions in the transformed image and to add realism. Our exhaustive evaluation and model comparisons demonstrate that utilizing geometry constraints adds fine details to the generated images and can be a better approach to cross-view image synthesis than purely pixel-based synthesis methods.
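To make the geometry-guided step concrete, the sketch below shows a minimal, illustrative homography warp in NumPy: each output pixel is inverse-mapped through a 3x3 homography `H` into the source image, and pixels that map outside it are flagged in a validity mask. The uncovered region of that mask corresponds to the part of the transformed image that the GAN stage would need to inpaint. This is a toy illustration, not the paper's implementation; the function name, the nearest-neighbor sampling, and the example homography are all assumptions chosen for brevity.

```python
import numpy as np

def warp_homography(img, H, out_shape):
    """Inverse-warp a 2D image through a 3x3 homography H.

    Returns the warped image and a boolean mask of pixels that were
    covered by the source; uncovered pixels (mask == False) are the
    missing regions a GAN-based inpainting stage would fill in.
    (Illustrative sketch: nearest-neighbor sampling, grayscale input.)
    """
    h_out, w_out = out_shape
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    # Homogeneous output coordinates, mapped back to source via H^{-1}.
    pts = np.stack([xs, ys, np.ones_like(xs)], axis=0).reshape(3, -1).astype(float)
    src = np.linalg.inv(H) @ pts
    src /= src[2]  # dehomogenize
    sx = np.round(src[0]).astype(int).reshape(h_out, w_out)
    sy = np.round(src[1]).astype(int).reshape(h_out, w_out)
    valid = (sx >= 0) & (sx < img.shape[1]) & (sy >= 0) & (sy < img.shape[0])
    out = np.zeros((h_out, w_out), dtype=img.dtype)
    out[valid] = img[sy[valid], sx[valid]]
    return out, valid

# Toy example: a pure-translation homography (shift 2 px to the right)
# applied to a 10x10 gradient image. The leftmost two columns of the
# output are uncovered and would be left to the inpainting network.
img = np.arange(100, dtype=float).reshape(10, 10)
H = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
warped, mask = warp_homography(img, H, (10, 10))
```

In the full pipeline described above, `H` would be estimated from the geometry relating the ground and aerial views rather than fixed by hand, and the masked-out region is exactly where the adversarial inpainting adds content.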