
LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation

Guangcong Zheng¹·* Xianpan Zhou²·* Xuewei Li¹·† Zhongang Qi³ Ying Shan³ Xi Li¹·⁴·⁵·⁶·†

Abstract

Recently, diffusion models have achieved great success in image synthesis. However, when it comes to layout-to-image generation, where an image often has a complex scene of multiple objects, how to exert strong control over both the global layout map and each detailed object remains a challenging task. In this paper, we propose a diffusion model named LayoutDiffusion that obtains higher generation quality and greater controllability than previous works. To overcome the difficult multimodal fusion of image and layout, we propose to construct a structural image patch with region information and to transform the patched image into a special layout, so that it can be fused with the normal layout in a unified form. Moreover, a Layout Fusion Module (LFM) and Object-aware Cross Attention (OaCA) are proposed to model the relationships among multiple objects; both are designed to be object-aware and position-sensitive, allowing precise control over spatially related information. Extensive experiments show that our LayoutDiffusion outperforms the previous SOTA methods on FID and CAS by relative margins of 46.35% and 26.70% on COCO-Stuff, and 44.29% and 41.82% on VG. Code is available at https://github.com/ZGCTroy/LayoutDiffusion.
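To make the abstract's "object-aware and position-sensitive" attention idea concrete, below is a minimal PyTorch sketch of what a cross attention in the spirit of OaCA could look like: image patch queries attend over layout tokens built by summing a class embedding and a bounding-box embedding per object, so each attended token carries both category and position. This is not the authors' implementation; all module and parameter names here (ObjectAwareCrossAttention, bbox_proj, the dimensions, and the 180-category count) are illustrative assumptions.

```python
# Hypothetical sketch of an object-aware, position-sensitive cross attention.
# Layout tokens fuse class identity and box geometry; image patches query them.

import torch
import torch.nn as nn


class ObjectAwareCrossAttention(nn.Module):
    def __init__(self, img_dim: int, layout_dim: int, num_classes: int, heads: int = 8):
        super().__init__()
        self.class_emb = nn.Embedding(num_classes, layout_dim)  # object category
        self.bbox_proj = nn.Linear(4, layout_dim)               # (x, y, w, h), normalized
        self.attn = nn.MultiheadAttention(
            embed_dim=img_dim, kdim=layout_dim, vdim=layout_dim,
            num_heads=heads, batch_first=True,
        )

    def forward(self, img_tokens, obj_classes, obj_boxes):
        # img_tokens:  (B, N, img_dim)  flattened image patch features
        # obj_classes: (B, M)           integer class id per layout object
        # obj_boxes:   (B, M, 4)        normalized bounding boxes
        layout_tokens = self.class_emb(obj_classes) + self.bbox_proj(obj_boxes)
        fused, _ = self.attn(query=img_tokens, key=layout_tokens, value=layout_tokens)
        return img_tokens + fused  # residual connection back into the image stream


# Usage: 64 image patches attending over a 3-object layout.
x = torch.randn(2, 64, 256)
cls = torch.randint(0, 180, (2, 3))
box = torch.rand(2, 3, 4)
out = ObjectAwareCrossAttention(img_dim=256, layout_dim=128, num_classes=180)(x, cls, box)
print(out.shape)  # torch.Size([2, 64, 256])
```

Fusing class and box embeddings into a single layout token is one straightforward way to realize the unified layout form the abstract describes; the paper's actual LFM and OaCA designs should be taken from the linked repository.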

