Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation

Zhuoyan Luo Fengyuan Shi Yixiao Ge Yujiu Yang Limin Wang Ying Shan

Abstract

We present Open-MAGVIT2, a family of auto-regressive image generation models ranging from 300M to 1.5B parameters. The Open-MAGVIT2 project produces an open-source replication of Google's MAGVIT-v2 tokenizer, a tokenizer with a super-large codebook (i.e., $2^{18}$ codes), and achieves state-of-the-art reconstruction performance (1.17 rFID) on ImageNet $256 \times 256$. Furthermore, we explore its application in plain auto-regressive models and validate its scalability properties. To assist auto-regressive models in predicting with a super-large vocabulary, we factorize it into two sub-vocabularies of different sizes by asymmetric token factorization, and further introduce "next sub-token prediction" to enhance sub-token interaction for better generation quality. We release all models and code to foster innovation and creativity in the field of auto-regressive visual generation.
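To make the idea of asymmetric token factorization concrete, the following is a minimal sketch of how a single index into a $2^{18}$-entry codebook could be split into two sub-tokens of different sizes and then recombined. The abstract does not state the sub-vocabulary sizes or the splitting scheme; the 6-bit / 12-bit split, the bit-shift approach, and the names `factorize` and `combine` are all assumptions made for illustration, not the project's actual implementation.

```python
from typing import Tuple

# Assumed sizes for illustration only: the abstract says the two
# sub-vocabularies differ in size but does not give the split.
TOTAL_BITS = 18                      # codebook of 2**18 codes (from the abstract)
HIGH_BITS = 6                        # hypothetical first sub-vocabulary: 2**6 entries
LOW_BITS = TOTAL_BITS - HIGH_BITS    # hypothetical second sub-vocabulary: 2**12 entries


def factorize(token: int) -> Tuple[int, int]:
    """Split one codebook index into (high, low) sub-token indices."""
    if not 0 <= token < (1 << TOTAL_BITS):
        raise ValueError("token index out of range for the codebook")
    high = token >> LOW_BITS               # top HIGH_BITS bits
    low = token & ((1 << LOW_BITS) - 1)    # bottom LOW_BITS bits
    return high, low


def combine(high: int, low: int) -> int:
    """Inverse of factorize: rebuild the original codebook index."""
    return (high << LOW_BITS) | low
```

Under these assumed sizes, a model would predict from vocabularies of 64 and 4096 entries instead of a single vocabulary of 262,144 entries, and `combine(*factorize(t))` returns `t` for every valid index `t`.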

