
Diagnostic Benchmark and Iterative Inpainting for Layout-Guided Image Generation

Cho, Jaemin ; Li, Linjie ; Yang, Zhengyuan ; Gan, Zhe ; Wang, Lijuan ; Bansal, Mohit

Abstract

Spatial control is a core capability in controllable image generation. Advancements in layout-guided image generation have shown promising results on in-distribution (ID) datasets with similar spatial configurations. However, it is unclear how these models perform when facing out-of-distribution (OOD) samples with arbitrary, unseen layouts. In this paper, we propose LayoutBench, a diagnostic benchmark for layout-guided image generation that examines four categories of spatial control skills: number, position, size, and shape. We benchmark two recent representative layout-guided image generation methods and observe that the good ID layout control may not generalize well to arbitrary layouts in the wild (e.g., objects at the boundary). Next, we propose IterInpaint, a new baseline that generates foreground and background regions step-by-step via inpainting, demonstrating stronger generalizability than existing models on OOD layouts in LayoutBench. We perform quantitative and qualitative evaluation and fine-grained analysis on the four LayoutBench skills to pinpoint the weaknesses of existing models. We show comprehensive ablation studies on IterInpaint, including training task ratio, crop&paste vs. repaint, and generation order. Lastly, we evaluate the zero-shot performance of different pretrained layout-guided image generation models on LayoutBench-COCO, our new benchmark for OOD layouts with real objects, where our IterInpaint consistently outperforms SOTA baselines in all four splits. Project website: https://layoutbench.github.io