DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection

Reconstruction-based approaches have achieved remarkable outcomes in anomaly detection. The exceptional image reconstruction capabilities of recently popular diffusion models have sparked research efforts to utilize them for enhanced reconstruction of anomalous images. Nonetheless, these methods might face challenges related to the preservation of image categories and pixel-wise structural integrity in the more practical multi-class setting. To solve the above problems, we propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection, which consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to Stable Diffusion's denoising network, and a feature-space pre-trained feature extractor. First, the SG network is proposed to reconstruct anomalous regions while preserving the original image's semantic information. Second, we introduce a Spatial-aware Feature Fusion (SFF) block to maximize reconstruction accuracy when dealing with extensively reconstructed areas. Third, the input and reconstructed images are processed by a pre-trained feature extractor to generate anomaly maps based on features extracted at different scales. Experiments on the MVTec-AD and VisA datasets demonstrate the effectiveness of our approach, which surpasses state-of-the-art methods, e.g., achieving 96.8/52.6 and 97.2/99.0 (AUROC/AP) for localization and detection respectively on the multi-class MVTec-AD dataset. Code will be available at https://lewandofskee.github.io/projects/diad.
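
To illustrate the final step of the pipeline, the sketch below shows one common way to turn the input image and its diffusion reconstruction into an anomaly map: comparing features from a pre-trained extractor at several scales via cosine distance, then upsampling and accumulating the per-scale distances. This is a minimal sketch under stated assumptions, not the authors' implementation; the choice of a ResNet-50 backbone, the selected stages, and all function and variable names are illustrative.

```python
# Hypothetical sketch: anomaly map from multi-scale feature comparison
# between an input image and its reconstruction. The backbone choice,
# stage names, and helpers are assumptions, not the paper's exact code.
import torch
import torch.nn.functional as F
import torchvision.models as models

def extract_features(backbone, x, stages=("layer1", "layer2", "layer3")):
    """Collect intermediate feature maps from the named backbone stages."""
    feats = []
    out = backbone.conv1(x)
    out = backbone.bn1(out)
    out = backbone.relu(out)
    out = backbone.maxpool(out)
    for name in ("layer1", "layer2", "layer3", "layer4"):
        out = getattr(backbone, name)(out)
        if name in stages:
            feats.append(out)
    return feats

@torch.no_grad()
def anomaly_map(x_input, x_recon, image_size=256):
    """Pixel-wise anomaly map: 1 - cosine similarity between features of
    the input and its reconstruction, upsampled and summed over scales."""
    backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
    amap = torch.zeros(x_input.size(0), 1, image_size, image_size)
    for f_in, f_rec in zip(extract_features(backbone, x_input),
                           extract_features(backbone, x_recon)):
        # Cosine distance per spatial location at this scale: (B, H, W).
        d = 1.0 - F.cosine_similarity(f_in, f_rec, dim=1, eps=1e-6)
        # Upsample each scale's distance map to the input resolution.
        d = F.interpolate(d.unsqueeze(1), size=(image_size, image_size),
                          mode="bilinear", align_corners=False)
        amap += d
    return amap  # higher values indicate likelier anomalous pixels

# Usage: localization uses the map directly; an image-level detection
# score can be taken as, e.g., the maximum over the map.
# score = anomaly_map(x, x_hat).amax(dim=(1, 2, 3))
```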