4 months ago

Abstract

Pixel-space generative models are often more difficult to train and generally underperform compared to their latent-space counterparts, leaving a persistent performance and efficiency gap. In this paper, we introduce a novel two-stage training framework that closes this gap for pixel-space diffusion and consistency models. In the first stage, we pre-train encoders to capture meaningful semantics from clean images while aligning them with points along the same deterministic sampling trajectory, which evolves points from the prior to the data distribution. In the second stage, we integrate the encoder with a randomly initialized decoder and fine-tune the complete model end-to-end for both diffusion and consistency models. Our training framework demonstrates strong empirical performance on the ImageNet dataset. Specifically, our diffusion model reaches an FID of 2.04 on ImageNet-256 and 2.35 on ImageNet-512 with 75 function evaluations (NFE), surpassing prior pixel-space methods by a large margin in both generation quality and efficiency while rivaling leading VAE-based models at comparable training cost. Furthermore, on ImageNet-256, our consistency model achieves an impressive FID of 8.82 in a single sampling step, significantly surpassing its latent-space counterpart. To the best of our knowledge, this marks the first successful training of a consistency model directly on high-resolution images without relying on pre-trained VAEs or diffusion models.

Jiachen Lei Keli Liu Julius Berner Haiming Yu Hongkai Zheng Jiahong Wu Xiangxiang Chu

Advancing End-to-End Pixel Space Generative Modeling via Self-supervised Pre-training
