View Inter-Prediction GAN: Unsupervised Representation Learning for 3D Shapes by Learning Global Shape Memories to Support Local View Predictions

In this paper, we present a novel unsupervised representation learning approach for 3D shapes, which is an important research challenge as it avoids the manual effort required for collecting supervised data. Our method trains an RNN-based neural network architecture to solve multiple view inter-prediction tasks for each shape. Given several nearby views of a shape, we define view inter-prediction as the task of predicting the center view between the input views, and reconstructing the input views in a low-level feature space. The key idea of our approach is to implement the shape representation as a shape-specific global memory that is shared between all local view inter-predictions for each shape. Intuitively, this memory enables the system to aggregate information that is useful to better solve the view inter-prediction tasks for each shape, and to leverage the memory as a view-independent shape representation. Our approach obtains the best results using a combination of L_2 and adversarial losses for the view inter-prediction task. We show that VIP-GAN outperforms state-of-the-art methods in unsupervised 3D feature learning on three large-scale 3D shape benchmarks.
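
As a rough illustration of how an L_2 term and an adversarial term can be combined into one training objective for the predicted center view, consider the minimal sketch below. It is not the authors' implementation: the function names, the weight `adv_weight`, and the non-saturating form of the adversarial term (-log D(G(x))) are assumptions made for illustration only, and the discriminator output is taken as a given probability rather than computed by a network.

```python
import numpy as np


def l2_loss(pred, target):
    """Mean squared error between predicted and target view features."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean((pred - target) ** 2))


def generator_adversarial_loss(disc_prob_fake, eps=1e-12):
    """Non-saturating generator loss, -log D(G(x)), averaged over the batch.

    disc_prob_fake holds the discriminator's probability that each
    predicted view is real (assumed here to be supplied by the caller).
    """
    p = np.clip(np.asarray(disc_prob_fake, dtype=float), eps, 1.0)
    return float(-np.mean(np.log(p)))


def combined_loss(pred, target, disc_prob_fake, adv_weight=0.1):
    """L_2 term plus a weighted adversarial term.

    adv_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    return l2_loss(pred, target) + adv_weight * generator_adversarial_loss(disc_prob_fake)
```

With a perfect prediction that the discriminator fully accepts, both terms vanish; otherwise the L_2 term penalizes feature-space error while the adversarial term penalizes predictions the discriminator judges to be fake.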