Adding Conditional Control to Text-to-Image Diffusion Models

Lvmin Zhang, Anyi Rao, Maneesh Agrawala

Abstract

We present ControlNet, a neural network architecture to add spatial conditioning controls to large, pretrained text-to-image diffusion models. ControlNet locks the production-ready large diffusion models and reuses their deep and robust encoding layers, pretrained with billions of images, as a strong backbone to learn a diverse set of conditional controls. The neural architecture is connected with "zero convolutions" (zero-initialized convolution layers) that progressively grow the parameters from zero and ensure that no harmful noise can affect the finetuning. We test various conditioning controls, e.g., edges, depth, segmentation, and human pose, with Stable Diffusion, using single or multiple conditions, with or without prompts. We show that the training of ControlNets is robust with small (<50k) and large (>1m) datasets. Extensive results show that ControlNet may facilitate wider applications to control image diffusion models.
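
The zero-convolution idea can be illustrated with a toy sketch. This is not the authors' implementation: it works on a single pixel with plain Python lists, uses a hypothetical `block` (a 1x1 convolution plus ReLU) as a stand-in for a pretrained encoding layer, and all weights are made up. The wiring (a locked block, a trainable copy that receives the condition through one zero convolution, and a second zero convolution on its output) is an assumed reading of the description above.

```python
def conv1x1(x, weight, bias):
    """1x1 convolution at one pixel: a linear map over the channel vector x."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def zero_conv_params(channels):
    """Zero convolution: weight and bias both start at exactly zero."""
    return [[0.0] * channels for _ in range(channels)], [0.0] * channels


def block(x, params):
    """Hypothetical stand-in for a pretrained layer: 1x1 conv followed by ReLU."""
    return [max(0.0, v) for v in conv1x1(x, *params)]


def controlled_block(x, cond, locked, trainable, zero_in, zero_out):
    """Locked block output plus a control branch gated by two zero convolutions."""
    base = block(x, locked)                      # locked weights, never updated
    hint = conv1x1(cond, *zero_in)               # condition enters via a zero conv
    branch = block([a + b for a, b in zip(x, hint)], trainable)
    residual = conv1x1(branch, *zero_out)        # branch leaves via a zero conv
    return [a + b for a, b in zip(base, residual)]


# Made-up "pretrained" parameters for a 2-channel toy example.
locked = ([[0.5, -0.2], [0.1, 0.3]], [0.1, -0.1])
trainable = ([row[:] for row in locked[0]], locked[1][:])  # trainable copy
zero_in, zero_out = zero_conv_params(2), zero_conv_params(2)

x = [1.0, 2.0]       # feature vector at one pixel
cond = [3.0, -1.0]   # conditioning vector (e.g. derived from an edge map)

baseline = block(x, locked)
controlled = controlled_block(x, cond, locked, trainable, zero_in, zero_out)

# Simulate training having moved one zero-conv weight away from zero.
nudged_out = ([[1.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
nudged = controlled_block(x, cond, locked, trainable, zero_in, nudged_out)
```

At initialization both zero convolutions output exactly zero, so `controlled` equals `baseline` and the condition cannot disturb the locked model. For a linear layer, the gradient of the output with respect to a zero-convolution weight is the incoming activation, which is generally nonzero, so training can still move the weights away from zero; `nudged` shows the output changing once that happens.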