Latent Refinement Decoding: Enhancing Diffusion-Based Language Models by Refining Belief States
Qinglin Zhu, Yizhen Yao, Runcong Zhao, Yanzheng Xiang, Amrutha Saseendran, Chen Jin, Philip Alexander Teare, Bin Liang, Yulan He, Lin Gui

Abstract
Autoregressive (AR) models remain the standard for natural language generation but still suffer from high latency due to strictly sequential decoding. Recent diffusion-inspired approaches, such as LlaDA and Dream, mitigate this by generating in parallel, yet they suffer from two core limitations: information loss, as predictive distributions for non-finalized tokens are discarded at each step, and premature commitment, where local decisions are made without sufficient global coordination. We introduce Latent Refinement Decoding (LRD), a two-stage framework with Latent Refinement and a Predictive Feedback Loop. The first stage maintains masked positions as distributional mixtures of predicted tokens and the mask embedding, allowing the model to establish more globally consistent beliefs. The second stage progressively finalizes confident tokens while retaining uncertain ones for iterative feedback. KL-divergence dynamics provide a principled and reliable criterion for convergence and early stopping. Experiments across coding (HumanEval +6.3, MBPP +2.6) and reasoning (GSM8K +2.9, MATH500 +3.8) show that LRD improves accuracy while delivering speedups of up to 10.6x, making it a strong and versatile alternative for parallel sequence generation.
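The two ideas in the abstract — representing a masked position as a soft mixture of predicted-token embeddings and the mask embedding, and stopping refinement once consecutive belief distributions stop changing under a KL criterion — can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`mixture_embedding`, `kl_divergence`), the mixing weight `alpha`, and the stand-in belief update are all hypothetical, and a real LRD step would query the diffusion language model instead.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two categorical belief distributions."""
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

def mixture_embedding(probs, token_embeds, mask_embed, alpha=0.5):
    """Soft input for a masked position: blend the expected token
    embedding under the current belief with the [MASK] embedding
    (hypothetical weighting; the paper's exact rule may differ)."""
    expected = probs @ token_embeds          # (V,) @ (V, d) -> (d,)
    return alpha * expected + (1.0 - alpha) * mask_embed

# Toy refinement loop for one masked position: the belief drifts toward
# a fixed target distribution (a stand-in for a model forward pass), and
# we stop early once consecutive beliefs agree to within a KL tolerance.
rng = np.random.default_rng(0)
V, d = 8, 4                                  # toy vocab size and embed dim
token_embeds = rng.normal(size=(V, d))
mask_embed = rng.normal(size=d)

belief = softmax(rng.normal(size=V))
target = softmax(rng.normal(size=V))
tol = 1e-6
for step in range(100):
    new_belief = 0.5 * belief + 0.5 * target  # stand-in for a model update
    soft_input = mixture_embedding(new_belief, token_embeds, mask_embed)
    if kl_divergence(new_belief, belief) < tol:
        break                                 # beliefs converged: finalize
    belief = new_belief

# Once converged, a confident position would be committed to its argmax
# token, while low-confidence positions stay masked for further feedback.
final_token = int(np.argmax(belief))
```

The KL test here plays the role the abstract assigns to "KL-divergence dynamics": refinement halts as soon as successive belief states are nearly identical, which is what permits early stopping.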