Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification
Abstract
Latent Zoning Network (LZN) unifies generative modeling, representation learning, and classification by creating a shared latent space for diverse data types.
Generative modeling, representation learning, and classification are three core problems in machine learning (ML), yet their state-of-the-art (SoTA) solutions remain largely disjoint. In this paper, we ask: Can a unified principle address all three? Such unification could simplify ML pipelines and foster greater synergy across tasks. We introduce Latent Zoning Network (LZN) as a step toward this goal. At its core, LZN creates a shared Gaussian latent space that encodes information across all tasks. Each data type (e.g., images, text, labels) is equipped with an encoder that maps samples to disjoint latent zones, and a decoder that maps latents back to data. ML tasks are expressed as compositions of these encoders and decoders: for example, label-conditional image generation uses a label encoder and image decoder; image embedding uses an image encoder; classification uses an image encoder and label decoder. We demonstrate the promise of LZN in three increasingly complex scenarios: (1) LZN can enhance existing models (image generation): When combined with the SoTA Rectified Flow model, LZN improves FID on CIFAR10 from 2.76 to 2.59, without modifying the training objective. (2) LZN can solve tasks independently (representation learning): LZN can implement unsupervised representation learning without auxiliary loss functions, outperforming the seminal MoCo and SimCLR methods by 9.3% and 0.2%, respectively, on downstream linear classification on ImageNet. (3) LZN can solve multiple tasks simultaneously (joint generation and classification): With image and label encoders/decoders, LZN performs both tasks jointly by design, improving FID and achieving SoTA classification accuracy on CIFAR10. The code and trained models are available at https://github.com/microsoft/latent-zoning-networks. The project website is at https://zinanlin.me/blogs/latent_zoning_networks.html.
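The abstract's core idea, that each task is a composition of per-modality encoders and decoders over one shared latent space, can be sketched in a few lines. This is a minimal illustrative mock, not the paper's implementation: the `Codec` class, the linear maps, and all dimensions below are assumptions standing in for LZN's learned networks.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 8  # illustrative size of the shared latent space

class Codec:
    """Hypothetical encoder/decoder pair for one data type (image, label, ...).

    In LZN these would be learned networks mapping data into disjoint
    latent zones; here they are random linear maps for illustration only.
    """

    def __init__(self, data_dim, latent_dim=LATENT_DIM):
        self.enc_w = rng.standard_normal((data_dim, latent_dim)) / np.sqrt(data_dim)
        self.dec_w = rng.standard_normal((latent_dim, data_dim)) / np.sqrt(latent_dim)

    def encode(self, x):
        # Map data into the shared latent space.
        return x @ self.enc_w

    def decode(self, z):
        # Map a shared latent back to this data type.
        return z @ self.dec_w

image_codec = Codec(data_dim=32)  # stands in for the image encoder/decoder
label_codec = Codec(data_dim=10)  # stands in for the label encoder/decoder

x_image = rng.standard_normal(32)
y_label = np.eye(10)[3]  # one-hot label "3"

# Tasks expressed as compositions of encoders and decoders:
embedding = image_codec.encode(x_image)                      # representation learning
generated = image_codec.decode(label_codec.encode(y_label))  # label-conditional generation
logits = label_codec.decode(image_codec.encode(x_image))     # classification
predicted_class = int(np.argmax(logits))
```

The point of the sketch is the wiring, not the math: adding a new task means composing existing encoders and decoders rather than training a new model.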
Community
To appear at NeurIPS 2025
Website: https://zinanlin.me/blogs/latent_zoning_networks.html
Code & models: https://github.com/microsoft/latent-zoning-networks
Related papers recommended by the Semantic Scholar API:
- OneCAT: Decoder-Only Auto-Regressive Model for Unified Understanding and Generation (2025)
- ShaLa: Multimodal Shared Latent Space Modelling (2025)
- MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer (2025)
- UniFlow: Unifying Speech Front-End Tasks via Continuous Generative Modeling (2025)
- SCALAR: Scale-wise Controllable Visual Autoregressive Learning (2025)
- Skywork UniPic: Unified Autoregressive Modeling for Visual Understanding and Generation (2025)
- Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents (2025)
Zoning... Zionist... What an unfortunate name. I would change it before "it" happens. Or... it's too late. You should think harder before naming things. lol.
Hi, just to clarify: the title is "Latent Zoning Network" (zoning = partitioning the latent space).
Models citing this paper: 1
Datasets citing this paper: 0
Spaces citing this paper: 0