
gsplat: An Open-Source Library for Gaussian Splatting

Abstract

gsplat is an open-source library designed for training and developing Gaussian Splatting methods. It provides a front-end with Python bindings compatible with the PyTorch library and a back-end made of highly optimized CUDA kernels. gsplat offers numerous features that improve the optimization of Gaussian Splatting models, including faster training, lower memory usage, and shorter convergence times. Experimental results show that gsplat reduces training time by up to 10% and memory usage by a factor of 4 compared with the original implementation of Kerbl et al. (2023). Used in several research projects, gsplat is actively maintained on GitHub. The source code is available at https://github.com/nerfstudio-project/gsplat under the Apache 2.0 license. We welcome contributions from the open-source community.

One-sentence Summary

gsplat is an open-source library for training and developing Gaussian Splatting methods that pairs PyTorch-compatible Python bindings with a highly optimized CUDA back-end, achieves up to 10% less training time and 4× less memory than the original Kerbl et al. (2023) implementation, and is actively maintained on GitHub and used in several research projects.

Key Contributions

  • gsplat provides a PyTorch-compatible Python frontend and a highly optimized CUDA backend, delivering a modular API to simplify integration, modification, and extension for Gaussian Splatting research.
  • The library incorporates optimization techniques that, in experiments, reduce training time by up to 10% and memory consumption by a factor of four relative to the original Kerbl et al. (2023) implementation.
  • Its modular design has been used in multiple research projects, and the open-source repository, which has attracted 39 contributors and over 1.6k GitHub stars, actively supports further innovation in the community.

Introduction

Gaussian Splatting has rapidly emerged as a powerful technique for high-fidelity 3D scene reconstruction and novel view synthesis, offering clear advantages over earlier NeRF-based methods in training speed, rendering efficiency, and ease of deployment on resource-limited devices. Prior open-source efforts, such as GauStudio and various PyTorch reproductions, primarily focused on reimplementing the original 3D Gaussian Splatting method with performance improvements, but lacked a modular and extensible API designed to support external modifications and ongoing research. The authors introduce gsplat, an open-source library that provides an efficient, user-friendly, and easily modifiable PyTorch-based interface for developing Gaussian Splatting models, while incorporating the latest research features and modern software engineering practices to foster community-driven innovation.

Dataset

The paper does not introduce a dataset, and it gives no details on dataset composition, sources, subsets, splits, or processing. The experiments reported below are run on the MipNeRF360 dataset.

Method

The authors present gsplat, an open-source library designed for the efficient training and development of Gaussian Splatting methods. The architecture consists of a PyTorch-compatible Python front-end and a back-end powered by highly optimized CUDA kernels. This design reduces training time and memory usage compared to the original implementation.

The core functionality revolves around a fully differentiable rasterization process. As shown in the figure below, the library computes the final rendered image by projecting 3D Gaussians onto the 2D image plane and compositing them.

The diagram illustrates the computational graph for a single pixel color $C_i$. The process begins with the raw Gaussian parameters: color $c_n$, opacity logit $\sigma_n$, mean position $\mu_n$, scale $s_n$, and rotation quaternion $q_n$. These parameters are transformed into their active forms. The scale and rotation define the 3D covariance matrix $\Sigma_n = R S S R^T$, where the scaling matrix $S$ is derived from $\tilde{s}_n = \exp(s_n)$ and the rotation matrix $R$ from the normalized quaternion $\hat{q}_n = \frac{q_n}{\|q_n\|}$. The 3D mean $\mu_n$ is projected to 2D screen coordinates $\mu'_n = P \mu_n$. The corresponding 2D covariance $\Sigma'_n$ is calculated using the Jacobian of the projection $J_n$ and the view matrix $W$, formulated as $\Sigma'_n = J_n W \Sigma_n W^T J_n^T$.
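The two covariance formulas above can be sketched in a few lines of plain Python. This is a minimal illustration, not gsplat's CUDA implementation: the helper names (`quat_to_rotmat`, `covariance_3d`, `covariance_2d`) are hypothetical, the quaternion is assumed to be ordered (w, x, y, z), and the Jacobian is assumed to be that of a pinhole camera with focal lengths `fx` and `fy`, which the text does not specify.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose a matrix stored as nested lists."""
    return [list(col) for col in zip(*a)]

def quat_to_rotmat(q):
    """Rotation matrix R from a quaternion (w, x, y, z); q is normalized first."""
    norm = math.sqrt(sum(c * c for c in q))
    w, x, y, z = (c / norm for c in q)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def covariance_3d(log_scale, quat):
    """Sigma = R S S R^T with S = diag(exp(log_scale))."""
    S = [[math.exp(log_scale[i]) if i == j else 0.0 for j in range(3)]
         for i in range(3)]
    R = quat_to_rotmat(quat)
    return matmul(matmul(matmul(R, S), S), transpose(R))

def covariance_2d(cov3d, W, mean_cam, fx, fy):
    """Sigma' = J W Sigma W^T J^T.

    W is the 3x3 rotation part of the view matrix and mean_cam is the Gaussian
    mean in camera coordinates. J is the Jacobian of an assumed pinhole
    projection (fx * x / z, fy * y / z) evaluated at mean_cam.
    """
    x, y, z = mean_cam
    J = [[fx / z, 0.0, -fx * x / (z * z)],
         [0.0, fy / z, -fy * y / (z * z)]]
    JW = matmul(J, W)
    return matmul(matmul(JW, cov3d), transpose(JW))

# Example: an axis-aligned Gaussian with scales (1, 2, 3), seen head-on from 2 units away.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
sigma = covariance_3d([0.0, math.log(2.0), math.log(3.0)], (1.0, 0.0, 0.0, 0.0))
sigma_2d = covariance_2d(sigma, identity, (0.0, 0.0, 2.0), fx=1.0, fy=1.0)
```

Storing the scale as a logarithm and normalizing the quaternion on the fly keeps the resulting covariance symmetric and positive semi-definite for any raw parameter values, which is why the raw parameters can be optimized without constraints.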

The opacity is activated via a sigmoid function $\tilde{\sigma}_n = \text{sigmoid}(\sigma_n)$. The 2D Gaussian evaluation $G'_n(x)$ is computed based on the projected mean and covariance. The blending weight $\alpha_n$ is then determined, leading to the final alpha-blended color accumulation:

$$C_i = \sum_{n \in N} c_n \alpha_n \prod_{j=1}^{n-1} (1 - \alpha_j)$$
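The accumulation for one pixel can be sketched as a front-to-back loop. The text does not spell out how the blending weight is formed, so this sketch assumes the usual 3D Gaussian Splatting choice $\alpha_n = \tilde{\sigma}_n \, G'_n(x)$ with $G'_n(x) = \exp\left(-\tfrac{1}{2}(x - \mu'_n)^T {\Sigma'_n}^{-1} (x - \mu'_n)\right)$. The function and dictionary key names are hypothetical, and the Gaussians are assumed to be already sorted from nearest to farthest.

```python
import math

def sigmoid(v):
    """Opacity activation: maps a raw logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def gaussian_2d(x, mean, cov):
    """Unnormalized 2D Gaussian exp(-0.5 * d^T cov^{-1} d) with d = x - mean."""
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * quad)

def composite_pixel(pixel, gaussians):
    """Alpha-blend depth-sorted Gaussians at one pixel.

    Each Gaussian is a dict with keys 'color' (any number of channels),
    'opacity_logit', 'mean2d', and 'cov2d'. Returns the blended color and the
    remaining transmittance prod_j (1 - alpha_j).
    """
    channels = len(gaussians[0]["color"]) if gaussians else 0
    color = [0.0] * channels
    transmittance = 1.0
    for g in gaussians:
        # Assumed blending weight: activated opacity times the 2D Gaussian value.
        alpha = sigmoid(g["opacity_logit"]) * gaussian_2d(pixel, g["mean2d"], g["cov2d"])
        for k in range(channels):
            color[k] += g["color"][k] * alpha * transmittance
        transmittance *= 1.0 - alpha
    return color, transmittance

# Example: a red Gaussian in front of a blue one, both centered on the pixel.
unit_cov = ((1.0, 0.0), (0.0, 1.0))
splats = [
    {"color": (1.0, 0.0, 0.0), "opacity_logit": 0.0, "mean2d": (0.0, 0.0), "cov2d": unit_cov},
    {"color": (0.0, 0.0, 1.0), "opacity_logit": 0.0, "mean2d": (0.0, 0.0), "cov2d": unit_cov},
]
pixel_color, remaining = composite_pixel((0.0, 0.0), splats)
```

In this example both weights are 0.5, so the front Gaussian contributes half of its color and the back one a quarter. Because the loop is agnostic to the number of channels, the same accumulation applies to feature vectors of any length.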

The red arrows in the diagram depict the backward pass, showing how gradients from the loss $\mathcal{L}$ flow back to update the parameters. Because the entire rendering pipeline is differentiable, the authors enable gradient flow not only to the Gaussian parameters $\mathcal{G}(c, \Sigma, \mu, o)$ but also to camera view matrices $\mathcal{P} = [\mathbf{R} \mid \mathbf{t}]$. This capability is crucial for mitigating pose uncertainty by optimizing camera poses via gradient descent alongside the scene representation.
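To make the idea of pose optimization concrete, here is a deliberately simplified toy, not the library's method: it refines only the camera translation $\mathbf{t}$ (rotation fixed to identity) by gradient descent on a point reprojection loss with hand-derived gradients. gsplat instead differentiates a rendering loss through its rasterizer. The function names, the pinhole model, the learning rate, and the step count are all assumptions chosen for illustration.

```python
def project(points, t, f=1.0):
    """Pinhole projection of 3D points for a camera with identity rotation and translation t."""
    out = []
    for X, Y, Z in points:
        cx, cy, cz = X + t[0], Y + t[1], Z + t[2]
        out.append((f * cx / cz, f * cy / cz))
    return out

def loss_and_grad(points, targets, t, f=1.0):
    """Sum of squared reprojection errors (times 0.5) and its gradient w.r.t. t."""
    loss = 0.0
    grad = [0.0, 0.0, 0.0]
    for (X, Y, Z), (u_ref, v_ref) in zip(points, targets):
        cx, cy, cz = X + t[0], Y + t[1], Z + t[2]
        du = f * cx / cz - u_ref
        dv = f * cy / cz - v_ref
        loss += 0.5 * (du * du + dv * dv)
        # Chain rule through u = f * cx / cz and v = f * cy / cz.
        grad[0] += du * f / cz
        grad[1] += dv * f / cz
        grad[2] -= (du * f * cx + dv * f * cy) / (cz * cz)
    return loss, grad

def optimize_translation(points, targets, t_init, lr=1.0, steps=2000):
    """Plain gradient descent on the camera translation."""
    t = list(t_init)
    for _ in range(steps):
        _, grad = loss_and_grad(points, targets, t)
        t = [ti - lr * gi for ti, gi in zip(t, grad)]
    return t

# Example: recover a perturbed translation from four observed 2D points.
points = [(1.0, 1.0, 3.0), (-1.0, 1.0, 4.0), (1.0, -1.0, 5.0), (-1.0, -1.0, 3.0)]
t_true = (0.3, -0.2, 0.5)
targets = project(points, t_true)
t_est = optimize_translation(points, targets, (0.0, 0.0, 0.0))
```

The same principle carries over to the full pipeline: once the loss is differentiable with respect to the view matrix, the pose can be updated by the same optimizer step that updates the Gaussians.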

To support the optimization loop, the library provides a modular API for densification strategies. Users can implement algorithms such as Adaptive Density Control, Absgrad, or Markov Chain Monte Carlo methods to manage the creation and pruning of Gaussians in under-reconstructed or over-reconstructed regions. Additionally, the backend supports N-dimensional rasterization for rendering high-dimensional feature vectors and includes anti-aliasing modes to prevent artifacts at varying resolutions.
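As a rough picture of what a densification step does, the sketch below follows the general shape of Adaptive Density Control from Kerbl et al. (2023): prune nearly transparent Gaussians, clone small Gaussians with a large positional gradient, and split large ones. It is not gsplat's API. The `Gaussian` class, the `densify_and_prune` function, the isotropic scale, the deterministic split offsets, and the threshold values are all simplifying assumptions made for illustration.

```python
from dataclasses import dataclass, replace

@dataclass
class Gaussian:
    mean: tuple        # 3D position
    scale: float       # isotropic scale, a simplification for brevity
    opacity: float     # activated opacity in (0, 1)
    grad_norm: float   # accumulated norm of the positional gradient

def densify_and_prune(gaussians, grad_threshold=0.0002,
                      scale_threshold=0.01, min_opacity=0.005):
    """One simplified density-control step (illustrative thresholds)."""
    out = []
    for g in gaussians:
        if g.opacity < min_opacity:
            continue  # prune: contributes almost nothing to the image
        if g.grad_norm <= grad_threshold:
            out.append(g)  # well reconstructed: keep unchanged
        elif g.scale <= scale_threshold:
            # Under-reconstructed region: clone the small Gaussian.
            out.append(g)
            out.append(replace(g))
        else:
            # Over-reconstructed region: split into two smaller Gaussians.
            # Real implementations sample the new positions randomly; a fixed
            # offset along x is used here to keep the example deterministic.
            small = g.scale / 1.6
            x, y, z = g.mean
            out.append(replace(g, mean=(x - small, y, z), scale=small))
            out.append(replace(g, mean=(x + small, y, z), scale=small))
    return out

# Example: one Gaussian of each kind.
scene = [
    Gaussian((0.0, 0.0, 0.0), 0.005, 0.001, 0.0),    # pruned (too transparent)
    Gaussian((1.0, 0.0, 0.0), 0.005, 0.8, 0.0001),   # kept as-is
    Gaussian((2.0, 0.0, 0.0), 0.005, 0.8, 0.001),    # cloned
    Gaussian((3.0, 0.0, 0.0), 0.16, 0.8, 0.001),     # split
]
updated = densify_and_prune(scene)
```

Keeping this logic behind a single function boundary is what makes strategies interchangeable: an alternative such as Absgrad or an MCMC-based scheme would replace the decision rules while the training loop stays the same.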

Experiment

The evaluation compares gsplat training with the original 3D Gaussian Splatting implementation on the MipNeRF360 dataset, using identical densification strategies and configuration on an A100 GPU. gsplat achieves equivalent novel-view synthesis quality while requiring less memory and substantially reducing training time. A feature comparison further validates the practical impact of library components, confirming that gsplat maintains rendering performance parity with improved efficiency.

The authors compare the training performance and efficiency of gsplat against the original 3D Gaussian Splatting implementation on the MipNeRF360 dataset. gsplat matches the original implementation's novel-view synthesis quality across standard metrics at both early and late training stages. It also uses less GPU memory than the baseline throughout training and completes training faster.

The authors evaluate the impact of individual gsplat features on the MipNeRF360 dataset. MCMC densification significantly reduces the number of Gaussians and the memory footprint while achieving the highest rendering quality and the fastest training time. ABSGRAD improves all quality metrics over the baseline while reducing the number of Gaussians and the training time. The antialiased variant provides a marginal quality improvement but requires slightly more memory and time.

The authors also report per-scene results for ABSGRAD, MCMC sampling, and antialiasing across scenes from the MipNeRF360 dataset. ABSGRAD outperforms the standard gsplat baseline on most tested scenes. Performance improves as the number of MCMC iterations increases from one million to three million. The antialiased configuration yields results very similar to the baseline.

In a second per-scene comparison on the MipNeRF360 dataset, MCMC densification provides the most substantial gains, with results improving as iterations increase from 1 million to 3 million. ABSGRAD offers slight improvements over the base gsplat model in several scenes but underperforms in others. The antialiased configuration yields metrics that are nearly identical to the standard gsplat baseline.

In a third per-scene comparison on the MipNeRF360 dataset, ABSGRAD achieves lower metric values than the base gsplat configuration across all scenes. MCMC values show a clear upward trend as the iteration count increases from 1 million to 3 million. The antialiased variant produces results very similar to the base gsplat configuration.

The authors evaluate the gsplat library on the MipNeRF360 dataset, comparing it to the original 3D Gaussian Splatting implementation and ablating features such as MCMC densification, ABSGRAD, and antialiasing. gsplat matches the original method's rendering quality while significantly reducing memory usage and training time. Among the features, MCMC densification delivers the most substantial benefits, lowering Gaussian count and memory footprint while improving quality as iteration count increases. ABSGRAD offers consistent quality improvements and efficiency gains, but its impact is milder and scene-dependent, while the antialiased variant performs nearly identically to the baseline.

