Calligrapher: Freestyle Text Image Customization

We introduce Calligrapher, a novel diffusion-based framework that integrates advanced text customization with artistic typography for digital calligraphy and design applications. Addressing the challenges of precise style control and data dependency in typographic customization, our framework incorporates three key technical contributions. First, we develop a self-distillation mechanism that leverages the pre-trained text-to-image generative model itself alongside a large language model to automatically construct a style-centric typography benchmark. Second, we introduce a localized style injection framework via a trainable style encoder, which comprises both Qformer and linear layers, to extract robust style features from reference images. Third, we employ an in-context generation mechanism that directly embeds reference images into the denoising process, further refining the alignment with target styles. Extensive quantitative and qualitative evaluations across diverse fonts and design contexts confirm Calligrapher's accurate reproduction of intricate stylistic details and precise glyph positioning. By automating high-quality, visually consistent typography, Calligrapher surpasses traditional models, empowering creative practitioners in digital art, branding, and contextual typographic design.
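
To make the style encoder described above more concrete, the following is a minimal, dependency-free sketch of the general pattern that a Qformer followed by linear layers suggests: a small set of query vectors cross-attends to patch features of the reference image, and a linear projection maps the result to style tokens. It is an illustration under stated assumptions, not the architecture used in this work. All names (`cross_attend`, `style_encoder`), the dimensions, and the random initialization are hypothetical. A real Qformer stacks several transformer layers with learned key/value projections and self-attention among the queries, all of which are omitted here.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cross_attend(queries, patches):
    """Single-head cross-attention: each query attends over all patch features.

    Simplification (assumption): keys and values are the raw patch features,
    with no learned projections.
    """
    width = len(patches[0])
    scale = math.sqrt(width)
    attended = []
    for q in queries:
        weights = softmax([dot(q, p) / scale for p in patches])
        # Attention-weighted sum of patch features, one output vector per query.
        attended.append(
            [sum(w * p[i] for w, p in zip(weights, patches)) for i in range(width)]
        )
    return attended


def linear(vectors, weight, bias):
    """Apply y = W v + b to each vector; `weight` has shape (out, in)."""
    return [[dot(row, v) + b for row, b in zip(weight, bias)] for v in vectors]


def style_encoder(patches, queries, weight, bias):
    """Hypothetical encoder: query-based attention followed by a linear layer."""
    return linear(cross_attend(queries, patches), weight, bias)


def random_matrix(rows, cols, rng):
    """Gaussian-initialized matrix standing in for features or trained weights."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
patches = random_matrix(16, 8, rng)  # 16 hypothetical patch features of width 8
queries = random_matrix(4, 8, rng)   # 4 query vectors (trainable in practice)
weight = random_matrix(6, 8, rng)    # projection to a hypothetical token width of 6
bias = [0.0] * 6

style_tokens = style_encoder(patches, queries, weight, bias)
print(len(style_tokens), len(style_tokens[0]))  # prints "4 6"
```

The number of style tokens is set by the number of queries rather than by the reference image resolution, which is one reason query-based encoders are convenient for conditioning a generative model on an image of arbitrary size.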