PointGPT: Auto-regressively Generative Pre-training from Point Clouds

Large language models (LLMs) based on the generative pre-training transformer (GPT) have demonstrated remarkable effectiveness across a diverse range of downstream tasks. Inspired by the advancements of GPT, we present PointGPT, a novel approach that extends the concept of GPT to point clouds, addressing the challenges posed by their disordered nature, their low information density, and the gap between generative pre-training and downstream tasks. Specifically, a point cloud auto-regressive generation task is proposed to pre-train transformer models. Our method partitions the input point cloud into multiple point patches and arranges them in an ordered sequence based on their spatial proximity. Then, an extractor-generator based transformer decoder, equipped with a dual masking strategy, learns latent representations conditioned on the preceding point patches, aiming to predict the next one in an auto-regressive manner. Our scalable approach allows for learning high-capacity models that generalize well, achieving state-of-the-art performance on various downstream tasks. In particular, our approach achieves classification accuracies of 94.9% on the ModelNet40 dataset and 93.4% on the ScanObjectNN dataset, outperforming all other transformer models. Furthermore, our method also attains new state-of-the-art accuracies on all four few-shot learning benchmarks.
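To make the sequence-construction step concrete, here is a minimal sketch of partitioning a point cloud into patches and ordering them by spatial proximity. The abstract only states that patches are "arranged in an ordered sequence based on their spatial proximity"; the use of farthest point sampling for patch centers, kNN grouping, and Morton-code sorting is an assumption on our part, and all sizes are illustrative.

```python
# Hedged sketch: patch partitioning + spatial ordering.
# FPS, kNN grouping, and Morton codes are assumed details,
# not confirmed by the abstract; hyperparameters are illustrative.
import numpy as np

def farthest_point_sampling(points, num_centers):
    """Greedily pick patch centers that are mutually far apart."""
    n = points.shape[0]
    centers = np.zeros(num_centers, dtype=np.int64)
    dist = np.full(n, np.inf)
    centers[0] = np.random.randint(n)
    for i in range(1, num_centers):
        d = np.linalg.norm(points - points[centers[i - 1]], axis=1)
        dist = np.minimum(dist, d)
        centers[i] = int(dist.argmax())
    return centers

def morton_code(center, bits=10):
    """Interleave quantized x/y/z bits so nearby centers get nearby codes."""
    q = np.clip((center * (2 ** bits - 1)).astype(np.int64), 0, 2 ** bits - 1)
    code = 0
    for b in range(bits):
        for axis in range(3):
            code |= ((q[axis] >> b) & 1) << (3 * b + axis)
    return code

def build_patch_sequence(points, num_patches=64, patch_size=32):
    """Partition a cloud into kNN patches, then order them by Morton code."""
    centers = points[farthest_point_sampling(points, num_patches)]
    # Each patch = the patch_size nearest neighbors of its center.
    d = np.linalg.norm(points[None, :, :] - centers[:, None, :], axis=-1)
    patches = points[np.argsort(d, axis=1)[:, :patch_size]]
    # Normalize centers to [0, 1] before computing Morton codes.
    lo, hi = points.min(0), points.max(0)
    norm = (centers - lo) / (hi - lo + 1e-8)
    order = np.argsort([morton_code(c) for c in norm])
    return patches[order], centers[order]

pts = np.random.rand(2048, 3).astype(np.float32)
patch_seq, center_seq = build_patch_sequence(pts)
print(patch_seq.shape, center_seq.shape)  # (64, 32, 3) (64, 3)
```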
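The auto-regressive objective implies a causal attention pattern, where each patch token attends only to its predecessors. The sketch below builds such a mask; additionally hiding a random fraction of each token's predecessors is one possible reading of the "dual masking strategy", and the mask_ratio value is purely illustrative.

```python
# Hedged sketch: causal attention mask with extra random masking of
# preceding tokens. The extra-masking interpretation of "dual masking"
# and the mask_ratio are assumptions, not taken from the paper.
import torch

def dual_mask(seq_len, mask_ratio=0.5, device="cpu"):
    """True = attend, False = blocked. Causal mask plus random masking."""
    causal = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool, device=device))
    keep = torch.rand(seq_len, seq_len, device=device) >= mask_ratio
    mask = causal & keep
    # Always let a token attend to itself so no row is fully blocked.
    mask |= torch.eye(seq_len, dtype=torch.bool, device=device)
    return mask

# Usage with standard scaled dot-product attention (batch=1, heads=8,
# 64 patch tokens, head dimension 32).
attn_mask = dual_mask(seq_len=64)
q = torch.randn(1, 8, 64, 32)
out = torch.nn.functional.scaled_dot_product_attention(
    q, torch.randn(1, 8, 64, 32), torch.randn(1, 8, 64, 32),
    attn_mask=attn_mask,
)
print(out.shape)  # torch.Size([1, 8, 64, 32])
```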