Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification

Generative modeling, representation learning, and classification are three core problems in machine learning (ML), yet their state-of-the-art (SoTA) solutions remain largely disjoint. In this paper, we ask: Can a unified principle address all three? Such unification could simplify ML pipelines and foster greater synergy across tasks. We introduce Latent Zoning Network (LZN) as a step toward this goal. At its core, LZN creates a shared Gaussian latent space that encodes information across all tasks. Each data type (e.g., images, text, labels) is equipped with an encoder that maps samples to disjoint latent zones, and a decoder that maps latents back to data. ML tasks are expressed as compositions of these encoders and decoders: for example, label-conditional image generation uses a label encoder and image decoder; image embedding uses an image encoder; classification uses an image encoder and label decoder. We demonstrate the promise of LZN in three increasingly complex scenarios: (1) LZN can enhance existing models (image generation): when combined with the SoTA Rectified Flow model, LZN improves FID on CIFAR10 from 2.76 to 2.59 without modifying the training objective. (2) LZN can solve tasks independently (representation learning): LZN can implement unsupervised representation learning without auxiliary loss functions, outperforming the seminal MoCo and SimCLR methods by 9.3% and 0.2%, respectively, on downstream linear classification on ImageNet. (3) LZN can solve multiple tasks simultaneously (joint generation and classification): with image and label encoders/decoders, LZN performs both tasks jointly by design, improving FID and achieving SoTA classification accuracy on CIFAR10. The code and trained models are available at https://github.com/microsoft/latent-zoning-networks. The project website is at https://zinanlin.me/blogs/latent_zoning_networks.html.
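The task-composition principle described above can be illustrated with a minimal, purely hypothetical sketch. The encoder/decoder stand-ins below (their names, projections, and the nearest-zone-center classifier) are assumptions for illustration, not the paper's actual architecture; they only show how pairing one encoder with one decoder yields a task.

```python
# Toy sketch of LZN-style task composition: every data type gets an encoder
# into a shared latent space and a decoder out of it; each ML task is a
# composition of one encoder with one decoder. All components are stand-ins.
import numpy as np

LATENT_DIM = 4  # toy dimensionality of the shared latent space

def image_encoder(image):
    # Stand-in: fixed random projection of a flattened image into the latent space.
    rng = np.random.default_rng(0)
    proj = rng.standard_normal((image.size, LATENT_DIM))
    return image.flatten() @ proj

def label_encoder(label, num_classes=10):
    # Stand-in: map each label to the center of its (disjoint) latent zone.
    centers = np.eye(num_classes, LATENT_DIM)
    return centers[label]

def image_decoder(latent):
    # Stand-in: fixed random projection from the latent space back to an 8x8 image.
    rng = np.random.default_rng(1)
    proj = rng.standard_normal((LATENT_DIM, 8 * 8))
    return (latent @ proj).reshape(8, 8)

def label_decoder(latent, num_classes=10):
    # Stand-in: classify by the nearest latent zone center.
    centers = np.eye(num_classes, LATENT_DIM)
    return int(np.argmin(np.linalg.norm(centers - latent, axis=1)))

# Tasks emerge as encoder/decoder compositions:
generated = image_decoder(label_encoder(3))                # conditional generation
embedding = image_encoder(np.ones((8, 8)))                 # representation learning
predicted = label_decoder(image_encoder(np.ones((8, 8))))  # classification
```

In this toy setting, routing a label through its encoder and back through the label decoder recovers the label exactly, since each label sits at the center of its own zone.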