5 months ago

Shunian Chen Hejin Huang Yexin Liu Zihan Ye Pengcheng Chen Chenghao Zhu Michael Guan Rongsheng Wang Junying Chen Guanbin Li

Abstract

Audio-driven talking head synthesis has achieved remarkable photorealism, yet state-of-the-art (SOTA) models exhibit a critical failure: they lack generalization to the full spectrum of human diversity in ethnicity, language, and age groups. We argue that this generalization gap is a direct symptom of limitations in existing training data, which lack the necessary scale, quality, and diversity. To address this challenge, we introduce TalkVid, a new large-scale, high-quality, and diverse dataset containing 1244 hours of video from 7729 unique speakers. TalkVid is curated through a principled, multi-stage automated pipeline that rigorously filters for motion stability, aesthetic quality, and facial detail, and is validated against human judgments to ensure its reliability. Furthermore, we construct and release TalkVid-Bench, a stratified evaluation set of 500 clips meticulously balanced across key demographic and linguistic axes. Our experiments demonstrate that a model trained on TalkVid outperforms counterparts trained on previous datasets, exhibiting superior cross-dataset generalization. Crucially, our analysis on TalkVid-Bench reveals performance disparities across subgroups that are obscured by traditional aggregate metrics, underscoring its necessity for future research. Code and data are available at https://github.com/FreedomIntelligence/TalkVid
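The point about aggregate metrics hiding subgroup disparities can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the subgroup names, the scores, and the helper `subgroup_means` are hypothetical and are not taken from TalkVid-Bench or the paper's evaluation code.

```python
from collections import defaultdict
from statistics import mean


def subgroup_means(records):
    """Group (subgroup, score) pairs and return the mean score per subgroup."""
    groups = defaultdict(list)
    for subgroup, score in records:
        groups[subgroup].append(score)
    return {name: mean(scores) for name, scores in groups.items()}


# Made-up scores for illustration only; these are not results from the paper.
records = [
    ("group_a", 0.75),
    ("group_a", 0.75),
    ("group_a", 0.75),
    ("group_b", 0.25),
]

# A single aggregate number over all clips.
aggregate = mean(score for _, score in records)

# The same scores broken down by subgroup, plus the spread between groups.
per_group = subgroup_means(records)
gap = max(per_group.values()) - min(per_group.values())

print(f"aggregate: {aggregate}")
print(f"per group: {per_group}")
print(f"gap between best and worst subgroup: {gap}")
```

In this toy setting the aggregate is dominated by the over-represented subgroup, so the weaker subgroup is only visible in the per-group breakdown. A balanced evaluation set is one way to keep such gaps from being masked.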

TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis
