HyperAI

New AI tool slashes data needs for medical image analysis, enabling accurate diagnosis with minimal training samples

5 days ago

A new artificial intelligence tool called GenSeg is revolutionizing medical image segmentation by significantly reducing the amount of labeled data needed to train accurate models. Developed by researchers at the University of California San Diego, including Ph.D. student Li Zhang and electrical and computer engineering professor Pengtao Xie, the tool enables high-performance segmentation even when only a small number of expert-annotated images are available.

Medical image segmentation involves labeling every pixel in an image to identify structures such as tumors, organs, or lesions. The task is traditionally done by highly trained radiologists or specialists, making it time-consuming and expensive. Deep learning models have shown promise in automating the process, but they typically require large datasets with pixel-level annotations, data that are often unavailable for rare diseases or less common imaging procedures.

GenSeg overcomes this challenge with a novel end-to-end framework that generates synthetic image-mask pairs to augment limited real-world data. Instead of relying solely on existing datasets, the system learns to create realistic synthetic images from segmentation masks, then uses those synthetic examples to train the segmentation model. Crucially, the process is iterative: the model’s performance directly guides the generation of new synthetic data, ensuring the artificial images are not just visually plausible but genuinely useful for improving accuracy.

In tests, GenSeg demonstrated strong performance across diverse medical imaging tasks, including identifying skin lesions in dermoscopy images, detecting breast cancer in ultrasound scans, analyzing placental vessels in fetoscopic images, spotting polyps in colonoscopy videos, and diagnosing foot ulcers from standard photos. It also worked effectively with 3D images, such as those used to map the hippocampus or liver.

The results were striking. In low-data scenarios, GenSeg improved segmentation accuracy by 10 to 20% compared with conventional methods, while using 8 to 20 times fewer real annotated images than standard approaches, potentially cutting data collection costs and time by a significant margin. Zhang explained that in a real-world setting, such as a dermatology clinic, a doctor might only need to annotate 40 skin lesion images instead of thousands; the AI would then use those few examples to help detect suspicious lesions in new patients quickly and accurately.

The system’s success lies in its integrated feedback loop, in which data generation and model training are tightly coupled. Rather than treating them as separate steps, GenSeg continuously refines its synthetic data based on how well the model performs, leading to more effective learning.

Looking ahead, the research team plans to enhance the tool’s intelligence and adaptability. They also aim to incorporate direct feedback from clinicians during training, ensuring the synthetic data better reflect real-world clinical needs and improve practical utility.

The findings were published in Nature Communications, highlighting a major step toward making advanced AI-powered diagnostics more accessible, especially in resource-limited healthcare settings.
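To give a rough feel for that coupling, here is a deliberately tiny, self-contained Python sketch of a generation-and-training feedback loop. Everything in it is invented for illustration: the thresholding "segmenter", the deterministic perturbation "generator", the 0.95 feedback threshold, and the Dice score as the accuracy metric. GenSeg itself uses deep neural networks for both roles, but the control flow is the same in spirit: segmentation performance on the synthetic data feeds back into how that data is generated.

```python
def make_synthetic_pair(mask, noise_level):
    """Toy 'generator': render a grayscale image from a binary mask.

    Foreground pixels are bright (0.8), background dark (0.2), with a
    deterministic alternating perturbation standing in for noise.
    """
    return [
        (0.8 if m else 0.2) + (noise_level if i % 2 == 0 else -noise_level)
        for i, m in enumerate(mask)
    ]

def segment(image, threshold=0.5):
    """Toy 'segmentation model': per-pixel thresholding."""
    return [1 if pixel > threshold else 0 for pixel in image]

def dice(pred, truth):
    """Dice coefficient, a standard segmentation accuracy metric."""
    overlap = sum(p & t for p, t in zip(pred, truth))
    return 2 * overlap / ((sum(pred) + sum(truth)) or 1)

def genseg_style_loop(real_masks, rounds=5):
    """End-to-end idea in miniature: segmentation accuracy on the
    synthetic data feeds back into how the generator behaves."""
    noise = 0.6          # start with an unrealistically noisy generator
    scores = []
    for _ in range(rounds):
        # 1. Generate synthetic image-mask pairs from the real masks.
        pairs = [(make_synthetic_pair(m, noise), m) for m in real_masks]
        # 2. Measure how well the segmenter does on the synthetic data.
        score = sum(dice(segment(img), m) for img, m in pairs) / len(pairs)
        scores.append(score)
        # 3. Feedback: if accuracy is poor, push the generator toward
        #    cleaner, more useful images (here, halve the noise).
        if score < 0.95:
            noise *= 0.5
    return scores

# Two hand-made 8-pixel "expert annotations" stand in for real masks.
masks = [[0, 0, 1, 1, 0, 1, 0, 0], [1, 1, 0, 0, 0, 0, 1, 1]]
scores = genseg_style_loop(masks)
print(scores)  # accuracy climbs to 1.0 as feedback refines the generator
```

In the real system both the generator and the segmenter are learned jointly, so the feedback signal shapes network weights rather than a single noise parameter, but the sketch shows why coupling the two steps beats generating a fixed synthetic dataset up front.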
