
DisCo-Diff: Enhancing Continuous Diffusion Models with Discrete Latents

Yilun Xu Gabriele Corso Tommi Jaakkola Arash Vahdat Karsten Kreis

Abstract

Diffusion models (DMs) have revolutionized generative learning. They utilize a diffusion process to encode data into a simple Gaussian distribution. However, encoding a complex, potentially multimodal data distribution into a single continuous Gaussian distribution arguably represents an unnecessarily challenging learning problem. We propose Discrete-Continuous Latent Variable Diffusion Models (DisCo-Diff) to simplify this task by introducing complementary discrete latent variables. We augment DMs with learnable discrete latents, inferred with an encoder, and train DM and encoder end-to-end. DisCo-Diff does not rely on pre-trained networks, making the framework universally applicable. The discrete latents significantly simplify learning the DM's complex noise-to-data mapping by reducing the curvature of the DM's generative ODE. An additional autoregressive transformer models the distribution of the discrete latents, a simple step because DisCo-Diff requires only few discrete variables with small codebooks. We validate DisCo-Diff on toy data, several image synthesis tasks, as well as molecular docking, and find that introducing discrete latents consistently improves model performance. For example, DisCo-Diff achieves state-of-the-art FID scores on class-conditioned ImageNet-64/128 datasets with an ODE sampler.
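To make the core idea concrete, here is a minimal toy sketch of the training step the abstract describes: an encoder infers a discrete latent from a clean data point, and a conditional denoiser is penalized with a denoising loss. All names (`encode`, `denoiser`, the bucketing rule, the toy modes) are illustrative stand-ins for the learned networks in the paper, not the authors' actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# A few discrete variables with a small codebook, as the abstract notes.
CODEBOOK = 4

def encode(x):
    """Toy encoder: infer a discrete latent z for data point x.
    Here we simply bucket x by magnitude (stand-in for a learned encoder)."""
    return int(np.clip(np.digitize(x, [-1.0, 0.0, 1.0]), 0, CODEBOOK - 1))

def denoiser(x_noisy, sigma, z):
    """Toy conditional denoiser D(x, sigma; z): the discrete latent z selects
    a mode, simplifying the noise-to-data mapping the DM must learn
    (stand-in for a learned diffusion model)."""
    modes = np.array([-2.0, -0.5, 0.5, 2.0])
    return x_noisy + (modes[z] - x_noisy) * sigma / (sigma + 1.0)

# One denoising-loss evaluation: encoder and DM would be trained end-to-end
# by backpropagating a loss of this form through both networks.
x0 = 1.7                                     # clean data point
z = encode(x0)                               # discrete latent inferred from x0
sigma = 0.5                                  # noise level
x_noisy = x0 + sigma * rng.standard_normal() # diffused sample
loss = (denoiser(x_noisy, sigma, z) - x0) ** 2
```

At sampling time the order reverses: a small autoregressive model first draws `z` from the learned latent distribution, and the generative ODE is then solved conditioned on that `z`.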

