
MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis

Zhou, Dewei ; Li, You ; Ma, Fan ; Zhang, Xiaoting ; Yang, Yi
Abstract

We present a Multi-Instance Generation (MIG) task, simultaneously generating multiple instances with diverse controls in one image. Given a set of predefined coordinates and their corresponding descriptions, the task is to ensure that generated instances are accurately at the designated locations and that all instances' attributes adhere to their corresponding description. This broadens the scope of current research on Single-instance generation, elevating it to a more versatile and practical dimension. Inspired by the idea of divide and conquer, we introduce an innovative approach named Multi-Instance Generation Controller (MIGC) to address the challenges of the MIG task. Initially, we break down the MIG task into several subtasks, each involving the shading of a single instance. To ensure precise shading for each instance, we introduce an instance enhancement attention mechanism. Lastly, we aggregate all the shaded instances to provide the necessary information for accurately generating multiple instances in stable diffusion (SD). To evaluate how well generation models perform on the MIG task, we provide a COCO-MIG benchmark along with an evaluation pipeline. Extensive experiments were conducted on the proposed COCO-MIG benchmark, as well as on various commonly used benchmarks. The evaluation results illustrate the exceptional control capabilities of our model in terms of quantity, position, attribute, and interaction. Code and demos will be released at https://migcproject.github.io/.
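
The divide-and-conquer idea in the abstract (split into per-instance subtasks, shade each instance, then aggregate) can be illustrated with a toy sketch. This is not the MIGC implementation: the `Instance` type, the function names, and the normalized box format are all hypothetical. The shading step is a placeholder that fills a box with a constant, standing in for the instance enhancement attention mechanism, and the averaging rule for overlapping boxes is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    description: str  # text describing this instance (unused by the toy shader)
    box: tuple        # hypothetical format: (x0, y0, x1, y1), normalized to [0, 1]


def box_mask(box, height, width):
    """Binary mask that is 1.0 for grid cells whose centre lies inside the box."""
    x0, y0, x1, y1 = box
    mask = []
    for row in range(height):
        cy = (row + 0.5) / height
        mask.append([
            1.0 if (x0 <= (col + 0.5) / width < x1 and y0 <= cy < y1) else 0.0
            for col in range(width)
        ])
    return mask


def shade_instance(mask, value):
    """Placeholder for single-instance shading: fill the masked region with a value."""
    return [[value * m for m in row] for row in mask]


def aggregate(shaded, masks):
    """Average the shaded maps where masks overlap; background stays at 0.0."""
    height, width = len(masks[0]), len(masks[0][0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            weight = sum(m[r][c] for m in masks)
            if weight > 0:
                out[r][c] = sum(s[r][c] for s in shaded) / weight
    return out


def generate_layout(instances, height, width):
    """Divide (one mask per instance), shade each one, then aggregate."""
    masks = [box_mask(inst.box, height, width) for inst in instances]
    # Each instance gets a distinct placeholder value: 1.0, 2.0, ...
    shaded = [shade_instance(m, float(i + 1)) for i, m in enumerate(masks)]
    return aggregate(shaded, masks)


if __name__ == "__main__":
    instances = [
        Instance("a red apple", (0.0, 0.0, 0.5, 1.0)),
        Instance("a blue cup", (0.5, 0.0, 1.0, 1.0)),
    ]
    for row in generate_layout(instances, 2, 4):
        print(row)
```

In this sketch each instance is handled independently before aggregation, so one instance's description cannot affect another's region. That isolation is the property the divide-and-conquer framing is after. A real system would replace the constant fill with per-instance features computed inside the diffusion model.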