HyperAI

Sana High Resolution Image Synthesis


1. Tutorial Introduction

Sana was released in January 2025 and is led jointly by NVIDIA, MIT, and Tsinghua University. It is a text-to-image framework that can efficiently generate images at resolutions up to 4096 × 4096. Sana synthesizes high-resolution, high-quality images quickly and has strong text-image alignment. The accompanying paper, "SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers", has been accepted by ICLR 2025.

This tutorial uses the Sana_1600M_1024px model for demonstration and runs on a single RTX 4090 GPU.

2. Operation Steps

1. After starting the container, click the API address to open the web interface

If "Bad Gateway" is displayed, it means the model is initializing. Please wait for about 1-2 minutes and refresh the page.
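The wait-and-refresh step above can also be scripted. The following is a minimal sketch, not part of the original tutorial: it polls the API address until the server answers with HTTP 200 instead of 502 (Bad Gateway). The function names `http_status` and `wait_until_ready`, the 180-second limit, and the 10-second interval are illustrative assumptions; replace the placeholder URL with the API address shown for your container.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=10.0):
    """Return the HTTP status code of a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Raised for 4xx/5xx responses, e.g. 502 while the model is loading.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return None


def wait_until_ready(url, max_wait=180.0, interval=10.0,
                     fetch=http_status, sleep=time.sleep):
    """Poll `url` until it returns HTTP 200 or `max_wait` seconds have passed.

    `fetch` and `sleep` are injectable so the loop can be tested offline.
    Returns True once the page is ready, False if the wait budget runs out.
    """
    waited = 0.0
    while True:
        if fetch(url) == 200:
            return True
        if waited >= max_wait:
            return False
        sleep(interval)
        waited += interval


# Example (hypothetical address, assumed for illustration):
# ready = wait_until_ready("https://<your-api-address>")
# print("Web interface is ready" if ready else "Still initializing, try again later")
```

Because the model usually finishes initializing within 1-2 minutes, a wait budget of about three minutes leaves some headroom; adjust it if your container starts more slowly.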

2. Usage demonstration

Citation Information

Thanks to GitHub user SuperYang for deploying this tutorial. The project reference information is as follows:

@misc{Sana2025,
  title={Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer},
  author={Enze Xie and Junsong Chen and Junyu Chen and Han Cai and Haotian Tang and Yujun Lin and Zhekai Zhang and Muyang Li and Ligeng Zhu and Yao Lu and Song Han},
  howpublished={\url{https://nvlabs.github.io/Sana/}},
  note={GitHub Repository with Code, Model & Documentation},
  year={2025}
}

Discussion

🖌️ If you come across a high-quality project, please leave us a message to recommend it! We have also set up a tutorial discussion group. Scan the QR code and add the note [SD Tutorial] to join the group, discuss technical issues, and share your results.