HyperAI
The Diffusion Duality

Sahoo, Subham Sekhar ; Deschenaux, Justin ; Gokaslan, Aaron ; Wang, Guanghan ; Chiu, Justin ; Kuleshov, Volodymyr
Abstract

Uniform-state discrete diffusion models hold the promise of fast text generation due to their inherent ability to self-correct. However, they are typically outperformed by autoregressive models and masked diffusion models. In this work, we narrow this performance gap by leveraging a key insight: uniform-state diffusion processes naturally emerge from an underlying Gaussian diffusion. Our method, Duo, transfers powerful techniques from Gaussian diffusion to improve both training and sampling. First, we introduce a curriculum learning strategy guided by the Gaussian process, doubling training speed by reducing variance. Models trained with curriculum learning surpass autoregressive models in zero-shot perplexity on 3 of 7 benchmarks. Second, we present Discrete Consistency Distillation, which adapts consistency distillation from the continuous to the discrete setting. This algorithm unlocks few-step generation in diffusion language models by accelerating sampling by two orders of magnitude. We provide the code and model checkpoints on the project page: http://s-sahoo.github.io/duo
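The duality the abstract refers to can be illustrated with a minimal sketch: interpolate a one-hot token embedding with Gaussian noise, then collapse the continuous latent back to a discrete state via argmax. At low signal levels the argmax is close to uniform over the vocabulary, while at high signal levels it concentrates on the original token, mirroring a uniform-state discrete diffusion. The vocabulary size, noise schedule, and variable names below are illustrative assumptions, not the paper's actual parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)

V = 8            # vocabulary size (illustrative assumption)
token = 5        # original token index (illustrative)

# One-hot encoding of the token
x = np.zeros(V)
x[token] = 1.0

def diffuse_and_discretize(alpha_t: float) -> int:
    """Gaussian diffusion latent z_t = alpha_t * x + sigma_t * eps,
    collapsed to a discrete state by argmax (hypothetical sketch)."""
    sigma_t = np.sqrt(1.0 - alpha_t**2)
    z_t = alpha_t * x + sigma_t * rng.standard_normal(V)
    return int(np.argmax(z_t))

# Near-noiseless: the discrete state almost always recovers the token.
high_signal = [diffuse_and_discretize(0.99) for _ in range(1000)]

# Near-pure noise: the discrete state is close to uniform over V states.
low_signal = [diffuse_and_discretize(0.01) for _ in range(1000)]
```

Comparing the empirical frequency of `token` in `high_signal` versus `low_signal` shows the two limiting regimes of the induced discrete process.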