Text-Aware Image Restoration with Diffusion Models

Jaewon Min, Jin Hyeon Kim, Paul Hyunbin Cho, Jaeeun Lee, Jihye Park, Minkyu Park, Sangpil Kim, Hyunhee Park, Seungryong Kim

Published: 6/15/2025

Abstract

Image restoration aims to recover degraded images. However, existing diffusion-based restoration methods, despite great success in natural image restoration, often struggle to faithfully reconstruct textual regions in degraded images. Those methods frequently generate plausible but incorrect text-like patterns, a phenomenon we refer to as text-image hallucination. In this paper, we introduce Text-Aware Image Restoration (TAIR), a novel restoration task that requires the simultaneous recovery of visual contents and textual fidelity. To tackle this task, we present SA-Text, a large-scale benchmark of 100K high-quality scene images densely annotated with diverse and complex text instances. Furthermore, we propose a multi-task diffusion framework, called TeReDiff, that integrates internal features from diffusion models into a text-spotting module, enabling both components to benefit from joint training. This allows for the extraction of rich text representations, which are utilized as prompts in subsequent denoising steps. Extensive experiments demonstrate that our approach consistently outperforms state-of-the-art restoration methods, achieving significant gains in text recognition accuracy. See our project page: https://cvlab-kaist.github.io/TAIR/
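
The feedback loop described above, where text spotted from the diffusion model's internal features serves as the prompt for the following denoising step, can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the TeReDiff implementation: the function names (`restore_with_text_prompts`, `toy_denoise_step`, `toy_spot_text`), the list-of-floats image representation, the empty initial prompt, and the halving update are all hypothetical stand-ins chosen to show the control flow.

```python
from typing import Callable, List, Tuple

# All names and types here are hypothetical stand-ins, not the TeReDiff API.
Image = List[float]
Features = List[float]


def restore_with_text_prompts(
    degraded: Image,
    denoise_step: Callable[[Image, str, int], Tuple[Image, Features]],
    spot_text: Callable[[Features], List[str]],
    num_steps: int,
) -> Tuple[Image, List[str]]:
    """Run a denoising loop where text spotted at step t prompts step t + 1."""
    latent = list(degraded)
    prompt = ""  # assumption: the first step runs without a text prompt
    prompts_used: List[str] = []
    for step in range(num_steps):
        prompts_used.append(prompt)
        # The denoiser returns the updated latent plus its internal features.
        latent, features = denoise_step(latent, prompt, step)
        # The text spotter reads those features; its output becomes the next prompt.
        prompt = " ".join(spot_text(features))
    return latent, prompts_used


def toy_denoise_step(latent: Image, prompt: str, step: int) -> Tuple[Image, Features]:
    # Toy update: halve every value and reuse the result as the "internal features".
    updated = [0.5 * v for v in latent]
    return updated, updated


def toy_spot_text(features: Features) -> List[str]:
    # Toy spotter: "reads" a word only once the signal magnitude drops below 1.
    return ["OPEN"] if max(abs(v) for v in features) < 1.0 else []


restored, prompts = restore_with_text_prompts(
    [4.0, -2.0], toy_denoise_step, toy_spot_text, num_steps=4
)
```

In this toy run the spotter recognizes nothing during the early steps, so the prompt stays empty until the third step produces a readable signal; the recognized word then conditions the fourth step. A real system would replace both stubs with learned modules.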
