Kiss3DGen: A 3D Asset Generation Framework Based on an Image Diffusion Model

1. Tutorial Introduction

Kiss3DGen is an open-source 3D generation and reconstruction framework developed by the EnVision-Research team and published in March 2025. It aims to efficiently repurpose pre-trained 2D diffusion models for 3D content generation tasks. It supports high-quality multi-view rendering, text-to-3D generation, image-to-3D conversion, and 3D mesh reconstruction, and integrates modules such as Flux, Multiview, Caption, Reconstruction, and LLM. It also introduces 3D Bundle Image technology, combined with normal maps and texture information, to achieve accurate geometric reconstruction. In addition, it can be used with tools such as ControlNet for 3D model enhancement and editing. The framework is easy to deploy and is useful for both academic research and practical applications. The related paper, "Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation", has been accepted to CVPR 2025.

This tutorial uses a dual-GPU RTX A6000 setup. The project supports English prompts only.

2. Project Examples

text-to-3D

image-to-3D

3. Operation Steps

1. After starting the container, click the API address to enter the Web interface

2. Usage steps

If "Bad Gateway" is displayed, it means that the model is initializing. Since the model is large, please wait about 5-7 minutes and then refresh the page.
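Instead of refreshing by hand, the wait can be scripted. The following is a minimal sketch, not part of Kiss3DGen: the function name `wait_until_ready`, the 10-minute timeout, and the 15-second polling interval are assumptions chosen for illustration, and the URL must be replaced with the API address of your own container. It treats any 5xx response (including 502 Bad Gateway) or connection error as "still initializing".

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(url, timeout_s=600, interval_s=15,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Poll `url` until it stops answering with a 5xx error or the timeout elapses.

    Returns True once the server answers with a non-5xx status, False on timeout.
    `opener` and `sleep` are injectable so the function can be tested offline.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with opener(url, timeout=10) as resp:
                if resp.status < 500:
                    return True
        except urllib.error.HTTPError as err:
            # urlopen raises HTTPError for 4xx/5xx; 502 means the model is still loading.
            if err.code < 500:
                return True
        except (urllib.error.URLError, OSError):
            # Connection refused or reset: the service is not listening yet.
            pass
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


# Example (hypothetical address; use the API address shown for your container):
# ready = wait_until_ready("https://your-api-address.example/")
# print("Web interface is up" if ready else "Still not ready, try again later")
```

The default timeout is deliberately longer than the 5-7 minutes mentioned above, so a slow start is less likely to be reported as a failure.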

text-to-3D

image-to-3D

Note: If you encounter an error, please use a smaller image. We recommend using an image smaller than 3 MB.
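The size recommendation can be checked before uploading. This is a small sketch under stated assumptions: the helper `is_within_upload_limit` is hypothetical and not part of the project, and "3 MB" is interpreted here as 3 × 1024 × 1024 bytes, which the tutorial does not specify.

```python
import os

# Assumption: 3 MB is taken to mean 3 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 3 * 1024 * 1024


def is_within_upload_limit(path, limit_bytes=MAX_UPLOAD_BYTES):
    """Return True if the file at `path` is strictly smaller than the limit."""
    return os.path.getsize(path) < limit_bytes


# Example (hypothetical file name):
# if not is_within_upload_limit("input.png"):
#     print("Image is 3 MB or larger; resize or recompress it before uploading.")
```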

Parameter Description

Redux strength: Controls how strongly the generated image is redrawn or optimized. Higher values modify the image more and change more of its details; lower values preserve more of the originally generated details and structure. Value range: 0–1.
Denoising strength: Controls how much denoising is applied during generation. Higher values (closer to 1) follow the input prompt more closely and deviate more from the original image; lower values produce a result closer to the original image. Value range: 0–1.
Enable Redux: When enabled, an optimization redraw is applied automatically after the image is generated, at the configured Redux strength, to improve image quality and detail.
Enable ControlNet: When enabled, ControlNet can be used during generation to apply structural or feature constraints (such as reference sketches, edge maps, or depth maps), so that the generated image meets specific structural requirements while keeping its style.
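The four parameters above can be summarized as a small settings object. This is only an illustrative sketch: the class name `GenerationSettings`, the field names, and the default values are assumptions, not the project's actual API or defaults. The only facts taken from the descriptions are the two toggles and the 0–1 range of the two strengths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical container mirroring the parameters in the web interface."""

    redux_strength: float = 0.5      # assumed default; valid range is 0-1
    denoising_strength: float = 0.5  # assumed default; valid range is 0-1
    enable_redux: bool = True        # assumed default
    enable_controlnet: bool = True   # assumed default

    def __post_init__(self):
        # Both strengths are documented as values between 0 and 1.
        for name in ("redux_strength", "denoising_strength"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0 and 1, got {value}")


# A low denoising strength keeps the result close to the original image.
conservative = GenerationSettings(redux_strength=0.2, denoising_strength=0.3)
```

Validating the range up front makes an out-of-range value fail immediately rather than after a long generation run.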

4. Discussion

🖌️ If you come across a high-quality project, please leave us a message to recommend it! We have also set up a tutorial discussion group. Scan the QR code and add the note [SD Tutorial] to join the group, discuss technical issues, and share your results ↓

Citation Information

The citation information for this project is as follows:

@article{lin2025kiss3dgen,
  title={Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation},
  author={Lin, Jiantao and Yang, Xin and Chen, Meixi and Xu, Yingjie and Yan, Dongyu and Wu, Leyi and Xu, Xinli and Xu, Lie and Zhang, Shunsi and Chen, Ying-Cong},
  journal={arXiv preprint arXiv:2503.01370},
  year={2025}
}

This notebook is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at [email protected] for prompt review and removal.
