Image Recognition: Gender Detection
Overview
One-sentence Summary
The authors propose an unsupervised lesion detection framework that reformulates the task as an image restoration problem, employing a probabilistic model with a network-based normative prior to detect lesions pixel-wise via maximum a posteriori estimation.
Key Contributions
- Reframes unsupervised lesion detection as an image restoration problem that estimates normative anatomical distributions exclusively from healthy imaging samples without requiring lesion annotations.
- Introduces a probabilistic framework utilizing network-parameterized priors to model healthy anatomy and identifies lesions pixel-wise through Maximum A Posteriori (MAP) estimation during reconstruction.
- Implements and evaluates the restoration pipeline using Variational Autoencoder and Gaussian Mixture VAE architectures trained on healthy magnetic resonance imaging scans to isolate anatomical outliers.
Introduction
Unsupervised lesion detection in medical imaging enables automated identification of anatomical abnormalities without labeled training data, acting as a vital pre-screening tool and enhancing the robustness of downstream diagnostic algorithms. Current deep learning methods approximate healthy anatomy using latent variable models and detect lesions through a prior-projection reconstruction step. This approach frequently produces high false positive rates because it assumes healthy tissue encodings remain stable despite lesion-driven intensity variations. The authors reframe unsupervised detection as an image restoration problem and leverage a probabilistic Maximum-A-Posteriori estimation framework to solve it. By combining a network-based normative prior with a strict penalty for large restoration deviations, their approach significantly suppresses false positives and achieves state-of-the-art performance on glioma and stroke MRI datasets.
Dataset
The authors utilize three public MRI datasets to train normative prior models and evaluate lesion detection performance:
- CamCAN (Training and Validation): Sourced from the Cambridge Centre for Ageing and Neuroscience, this cohort includes T1 and T2 weighted scans from 652 healthy adults aged 18 to 87. The authors split the data into 600 subjects for training VAE and GMVAE prior models, and 52 subjects for validating hyperparameters.
- BRATS17 (Evaluation): Downloaded from the 2017 Multimodal Brain Tumor Image Segmentation Challenge, this subset contains T2 weighted scans from 285 patients with brain tumors, comprising 210 high grade glioblastomas and 75 low grade gliomas. Only T2 images are used since lesions appear as hyperintense regions, and ground truth tumor segmentations are provided.
- ATLAS (Evaluation): Sourced from the Anatomical Tracings of Lesions After Stroke project, this subset comprises 220 T1 weighted scans from stroke patients. Lesions appear as hypointense regions, and pixel wise ground truth masks are available.
- Training Strategy and Data Alignment: Two separate prior models are trained on CamCAN modalities to align with the target evaluation datasets. The T2 model trains on CamCAN T2 data and evaluates on BRATS17, while the T1 model trains on CamCAN T1 data and evaluates on ATLAS.
- Preprocessing and Normalization: All scans undergo skull stripping and MNI space registration. The authors apply histogram matching across CamCAN subjects, followed by pixel intensity normalization using a fixed reference subject. Background pixels are set to -3.5. To reduce domain gaps, BRATS17 and ATLAS scans are histogram matched to their respective CamCAN reference scans after within dataset normalization.
- Cropping and 2D Conversion: Computations rely exclusively on 2D transversal slices. Edge slices lacking brain tissue and excessive background regions are removed. All remaining slices are standardized to a 200 by 200 resolution. Inference runs independently on each slice, with final evaluation metrics aggregated at the subject level.
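The slice-level part of this preprocessing can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that skull stripping, registration, and histogram matching have already been applied, that normalization is a z-score using reference statistics (`ref_mean` and `ref_std` are hypothetical parameters), and that the 200 by 200 size is reached by center cropping or padding. Only the background value of -3.5 and the target size come from the description above.

```python
import numpy as np

BACKGROUND_VALUE = -3.5  # background intensity stated in the text
TARGET_SIZE = 200        # slices are standardized to 200 x 200


def normalize_slice(slice_2d, brain_mask, ref_mean, ref_std):
    """Z-score brain pixels with reference statistics (assumed scheme)
    and set every non-brain pixel to the background value."""
    out = np.full(slice_2d.shape, BACKGROUND_VALUE, dtype=np.float32)
    out[brain_mask] = (slice_2d[brain_mask] - ref_mean) / ref_std
    return out


def center_crop_or_pad(slice_2d, size=TARGET_SIZE, fill=BACKGROUND_VALUE):
    """Center-crop (or pad with background) a 2D slice to size x size."""
    out = np.full((size, size), fill, dtype=slice_2d.dtype)
    h, w = slice_2d.shape
    ch, cw = min(h, size), min(w, size)
    src_top, src_left = (h - ch) // 2, (w - cw) // 2
    dst_top, dst_left = (size - ch) // 2, (size - cw) // 2
    out[dst_top:dst_top + ch, dst_left:dst_left + cw] = \
        slice_2d[src_top:src_top + ch, src_left:src_left + cw]
    return out


def volume_to_slices(volume, brain_mask, ref_mean, ref_std):
    """Drop transversal slices without brain tissue, then normalize
    and resize the remaining ones. Axis 0 is assumed to index slices."""
    slices = []
    for z in range(volume.shape[0]):
        if not brain_mask[z].any():
            continue  # edge slice lacking brain tissue
        norm = normalize_slice(volume[z], brain_mask[z], ref_mean, ref_std)
        slices.append(center_crop_or_pad(norm))
    if not slices:
        return np.empty((0, TARGET_SIZE, TARGET_SIZE), dtype=np.float32)
    return np.stack(slices)
```

Each returned slice would then be passed through inference independently, with the results collected per subject for evaluation.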
Method
The authors leverage a probabilistic framework for unsupervised lesion detection based on maximum a posteriori (MAP) estimation within a latent variable model architecture. The method operates in two stages: first, learning a normative prior distribution of healthy anatomical images using either a Variational Autoencoder (VAE) or a Gaussian Mixture VAE (GMVAE), and second, applying MAP-based restoration to detect lesions in new images. For the first stage, the model is trained exclusively on healthy subjects to learn the distribution of normal brain anatomy. The architecture consists of an encoder, which maps input images to a latent space, and a decoder, which reconstructs the image from the latent representation. In the VAE variant, the encoder outputs the parameters of a Gaussian distribution over the latent variables, while the GMVAE extends this by modeling the latent space as a mixture of Gaussians. Training maximizes the evidence lower bound (ELBO), equivalently minimizing the negative ELBO, which is a tractable lower bound on the log-likelihood of the data under the learned model.
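To make the training objective concrete, the sketch below estimates the ELBO of a plain VAE for one image. It is an illustration under stated assumptions rather than the authors' implementation: the prior over the latent variables is a standard normal (the GMVAE mixture prior is not covered), the likelihood is Gaussian with a fixed standard deviation `sigma`, and `decode` is a hypothetical stand-in for the decoder network.

```python
import numpy as np


def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)


def elbo_estimate(x, decode, mu, log_var, rng, n_samples=1, sigma=1.0):
    """Monte Carlo estimate of E_q[log p(x|z)] - KL(q(z|x) || p(z)).

    x        : observed image (any shape)
    decode   : callable mapping a latent sample z to a reconstruction of x
    mu       : encoder mean for x
    log_var  : encoder log-variance for x
    """
    std = np.exp(0.5 * log_var)
    log_lik = 0.0
    for _ in range(n_samples):
        # Reparameterization: z = mu + std * eps, with eps ~ N(0, I)
        z = mu + std * rng.standard_normal(mu.shape)
        recon = decode(z)
        # Gaussian log-likelihood with fixed sigma (an assumption)
        log_lik += (-0.5 * np.sum((x - recon) ** 2) / sigma ** 2
                    - 0.5 * x.size * np.log(2.0 * np.pi * sigma ** 2))
    return log_lik / n_samples - gaussian_kl(mu, log_var)
```

In a training loop, the negative of this quantity would be the loss averaged over healthy images; the KL term vanishes only when the encoder output matches the prior exactly.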
For lesion detection, the method treats an image with a lesion as a corrupted version of a healthy image, where the lesion corresponds to an additive outlier component. The goal is to recover the underlying healthy image by solving a MAP estimation problem that balances data consistency with the normative prior. The posterior P(X∣Y) ∝ P(Y∣X) P(X) is maximized, where X is the lesion-free image and Y is the observed image with a lesion. Because the prior term log P(X) is intractable under the latent variable model, it is approximated by the ELBO, which provides a tractable objective for gradient-based optimization. The data consistency term, P(Y∣X), is modeled using the Total Variation (TV) norm of the difference between X and Y, which penalizes deviations and favors smooth, contiguous lesion-like structures over isolated pixel changes. Restoration proceeds by iterative gradient ascent on the combined objective, starting from the input image Y, and yields the restored image X̂ and the estimated lesion D̂ = Y − X̂.
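The restoration loop can be sketched as below. This is a simplified illustration, not the authors' code: the TV norm is taken in its anisotropic form, a subgradient replaces automatic differentiation, and `prior_grad` is a hypothetical callable that returns the gradient of the prior term with respect to the image. In the described method that term would be the ELBO of the trained VAE or GMVAE; the names `lam`, `step`, and `n_steps` are likewise assumptions.

```python
import numpy as np


def tv_norm(d):
    """Anisotropic total variation of a 2D array."""
    return (np.abs(np.diff(d, axis=0)).sum()
            + np.abs(np.diff(d, axis=1)).sum())


def tv_subgradient(d):
    """A subgradient of tv_norm with respect to d."""
    g = np.zeros_like(d)
    sv = np.sign(np.diff(d, axis=0))  # sign(d[i+1, j] - d[i, j])
    sh = np.sign(np.diff(d, axis=1))  # sign(d[i, j+1] - d[i, j])
    g[1:, :] += sv
    g[:-1, :] -= sv
    g[:, 1:] += sh
    g[:, :-1] -= sh
    return g


def map_restore(y, prior_grad, lam=1.0, step=0.01, n_steps=100):
    """Gradient ascent on  prior(x) - lam * TV(x - y),  starting at x = y.

    Returns the restored image and the estimated lesion map y - x.
    """
    x = y.astype(float)  # astype returns a copy, so y is left untouched
    for _ in range(n_steps):
        ascent = prior_grad(x) - lam * tv_subgradient(x - y)
        x = x + step * ascent
    return x, y - x
```

As a toy usage example, a prior that prefers all-zero images (`prior_grad = lambda x: -x`) applied to an image containing a bright square pulls the square toward zero, so the lesion map is large inside the square and close to zero elsewhere. A larger `lam` keeps the restoration closer to the input.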
The architecture of the encoder and decoder networks is based on a residual block design. Each block consists of a down-sampling convolutional layer, followed by batch normalization and a convolutional layer, with a skip connection that adds the input to the output. This structure is used in both the VAE and GMVAE models, with the GMVAE additionally incorporating a mixture of Gaussian components in the latent space. The latent variable dimension is set to 2×2×512 for both models to ensure a fair comparison. The down-sampling is performed with a stride of 2, and up-sampling is achieved via bilinear interpolation. The decoder network mirrors the encoder structure. Leaky ReLU activation is used in all hidden layers, while identity activation is applied to the output layers and those connecting to the latent variables. 
Experiment
The evaluation setup involved testing the proposed unsupervised lesion detection framework across multiple brain MRI datasets while systematically examining hyperparameter sensitivity, three-dimensional consistency, and iterative restoration dynamics. These experiments collectively validate that the model reliably converges during optimization, maintains spatial coherence across volume slices, and exhibits robust performance across varying data consistency weights and mixture configurations. Qualitatively, detection accuracy is strongly influenced by lesion characteristics, with the method performing reliably on larger abnormalities but showing reduced sensitivity to smaller or subtle pathologies. Overall, the findings confirm the framework's effectiveness and parameter stability, while acknowledging inherent limitations in handling faint lesions and extensive structural deformations.
The authors compare GMVAE-based models with varying hyperparameters against other lesion detection methods. The GMVAE models achieve higher AUC and Dice scores than the competing methods, and their performance is relatively stable with respect to the number of Gaussian mixtures and the latent space dimension.
The authors compare their proposed method, GMVAE(TV), with baseline models on lesion detection tasks, where it achieves higher AUC and Dice scores on the tested datasets. Its performance varies only slightly with the number of Gaussian mixtures and the latent dimension. The model also detects lesions consistently across the slices of a 3D volume and converges stably during iterative restoration.
The authors analyze how the hyperparameters of the GMVAE prior affect detection performance, focusing on the number of Gaussian mixtures and the latent space dimension. AUC values are relatively consistent across configurations, indicating robustness to parameter selection, whereas DSC scores vary more, with some combinations achieving notably higher performance. During iterative image restoration the model converges stably, with AUC values stabilizing after a certain number of optimization steps.
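For reference, the two metrics discussed here can be computed as in the following sketch. It assumes that a pixel-wise lesion score (for example the absolute restoration residual) is compared with a binary ground truth mask; how the authors derive scores and choose the Dice threshold is not specified in this summary, so `threshold` and `subject_level_dice` are hypothetical.

```python
import numpy as np


def dice_score(pred_mask, true_mask):
    """Dice similarity coefficient (DSC) between two binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    denom = pred.sum() + true.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * np.logical_and(pred, true).sum() / denom


def roc_auc(scores, labels):
    """ROC AUC via the rank-sum (Mann-Whitney U) statistic, ties averaged."""
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.asarray(labels, dtype=bool).ravel()
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")
    # Average 1-based rank of each distinct score value
    _, inverse, counts = np.unique(scores, return_inverse=True,
                                   return_counts=True)
    avg_rank = np.cumsum(counts) - (counts - 1) / 2.0
    ranks = avg_rank[inverse]
    u_stat = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def subject_level_dice(score_maps, true_masks, threshold):
    """Mean Dice over subjects; each entry holds all slices of one subject."""
    per_subject = [dice_score(np.asarray(s) > threshold, m)
                   for s, m in zip(score_maps, true_masks)]
    return float(np.mean(per_subject))
```

AUC is threshold-free, while Dice depends on the chosen threshold, which is one plausible reason the two metrics can respond differently to hyperparameter changes.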
The experiments evaluate the proposed GMVAE-based framework against baseline methods for lesion detection, showing higher detection accuracy. Subsequent analyses investigate the model's sensitivity to architectural choices and find that performance remains robust across varying numbers of Gaussian mixtures and latent space dimensions. Finally, the study examines the algorithm's optimization behavior, showing stable convergence during iterative image restoration and consistent lesion identification across the slices of a 3D volume.