DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding

Human motion, inherently continuous and dynamic, presents significant challenges for generative models. Despite their dominance, discrete quantization methods, such as VQ-VAEs, suffer from inherent limitations, including restricted expressiveness and frame-wise noise artifacts. Continuous approaches, while producing smoother and more natural motions, often falter due to high-dimensional complexity and limited training data. To resolve this "discord" between discrete and continuous representations, we introduce DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding, a novel method that decodes discrete motion tokens into continuous motion through rectified flow. By employing an iterative refinement process in the continuous space, DisCoRD captures fine-grained dynamics and ensures smoother and more natural motions. Compatible with any discrete-based framework, our method enhances naturalness without compromising faithfulness to the conditioning signals. Extensive evaluations demonstrate that DisCoRD achieves state-of-the-art performance, with FID of 0.032 on HumanML3D and 0.169 on KIT-ML. These results solidify DisCoRD as a robust solution for bridging the divide between discrete efficiency and continuous realism. Our project page is available at: https://whwjdqls.github.io/discord.github.io/.
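
To make the idea of iterative refinement in the continuous space concrete, the following is a minimal sketch of rectified-flow sampling conditioned on discrete tokens, not the DisCoRD implementation. It assumes the generic rectified-flow formulation, in which a sample starts as Gaussian noise at t = 0 and is moved toward the data at t = 1 by Euler steps along a velocity field. The names `toy_velocity`, `decode_tokens`, `codebook`, and `num_steps` are hypothetical, and the closed-form velocity field is only a stand-in for a trained, token-conditioned network.

```python
import numpy as np


def toy_velocity(x, t, cond):
    """Hypothetical stand-in for a learned velocity network v(x, t, cond).

    It points from the current sample straight at the conditioning features,
    scaled so that the remaining distance is covered by t = 1. A real decoder
    would replace this with a trained neural network.
    """
    return (cond - x) / (1.0 - t)


def decode_tokens(tokens, codebook, velocity_fn=toy_velocity, num_steps=8, seed=0):
    """Decode discrete token indices into a continuous sequence by Euler
    integration of dx/dt = v(x, t, cond) from t = 0 (noise) to t = 1."""
    rng = np.random.default_rng(seed)
    cond = codebook[tokens]               # (T, D) features looked up per token
    x = rng.standard_normal(cond.shape)   # start from Gaussian noise
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt                        # t stays below 1, so 1 - t is never 0
        x = x + dt * velocity_fn(x, t, cond)
    return x


# Assumed toy data: 4 codebook entries of dimension 3, and a 3-token sequence.
codebook = np.arange(12.0).reshape(4, 3)
tokens = np.array([2, 0, 3])
motion = decode_tokens(tokens, codebook)
print(motion.shape)
```

With this toy velocity field the trajectory ends on the looked-up codebook features, which only checks the integration loop. With a learned velocity field, the same loop would instead produce continuous motion that the discrete tokens condition but do not fully determine.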