
CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior

Jinbo Xing; Menghan Xia; Yuechen Zhang; Xiaodong Cun; Jue Wang; Tien-Tsin Wong

Abstract

Speech-driven 3D facial animation has been widely studied, yet a gap remains in achieving realism and vividness due to the highly ill-posed nature of the problem and the scarcity of audio-visual data. Existing works typically formulate the cross-modal mapping as a regression task, which suffers from the regression-to-mean problem and leads to over-smoothed facial motions. In this paper, we propose to cast speech-driven facial animation as a code query task in a finite proxy space of a learned codebook, which effectively promotes the vividness of the generated motions by reducing the cross-modal mapping uncertainty. The codebook is learned by self-reconstruction over real facial motions and is thus embedded with realistic facial motion priors. Over the discrete motion space, a temporal autoregressive model is employed to sequentially synthesize facial motions from the input speech signal, which guarantees lip-sync as well as plausible facial expressions. We demonstrate that our approach outperforms current state-of-the-art methods both qualitatively and quantitatively. A user study further confirms its superiority in perceptual quality.
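The "code query" step described above amounts to replacing a continuous motion prediction with its nearest entry in a learned codebook, so the output is constrained to a finite space of realistic motions. A minimal NumPy sketch of this quantization step (toy codebook and feature sizes are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def quantize(features, codebook):
    """Map each continuous per-frame feature to its nearest codebook entry.

    features: (T, D) array of predicted motion features for T frames.
    codebook: (K, D) array of K learned discrete motion codes.
    Returns the quantized features and the discrete code index per frame.
    """
    # Squared Euclidean distance between every frame and every code: (T, K)
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)        # discrete code id per frame
    return codebook[indices], indices     # snapped motions + code ids

# Toy example: 8 codes of dimension 4, a 5-frame feature sequence.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))
features = rng.normal(size=(5, 4))
quantized, idx = quantize(features, codebook)
```

In the full method, an autoregressive model would predict these code indices from the speech signal frame by frame, and a decoder would map the quantized sequence back to facial motions; the lookup above is only the proxy-space projection that removes the regression-to-mean ambiguity.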
