
Hyper-SD Real-time Painting

Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis

1. Tutorial Introduction

This tutorial can be launched with just an RTX 4040. Note: prompts are supported in English only.

Hyper-SD is an image synthesis framework released by ByteDance in 2024 that aims to improve the efficiency and performance of diffusion models in image synthesis tasks. Through Trajectory Segmented Consistency Distillation (TSCD), it significantly improves the efficiency of image synthesis while maintaining the high quality of the generated images.

Key features of Hyper-SD include:

  • Trajectory Segmented Consistency Distillation (TSCD): Consistency distillation is performed progressively within predefined time-step segments, which preserves the original ODE (ordinary differential equation) trajectory while reducing the number of inference steps.
  • Human feedback learning: Human aesthetic preferences for generated images are incorporated through feedback learning, which significantly improves image quality, especially in low-step inference settings.
  • Unified LoRA model: A single LoRA model supporting 1- to 8-step inference is proposed, giving users with different needs flexibility while keeping inference consistent across all step counts.
  • Performance improvement: In few-step inference, Hyper-SD surpasses existing methods on multiple evaluation metrics, including CLIP Score and Aes Score, demonstrating its leading position in image synthesis tasks.
  • Hyper-SD achieves SOTA-level image generation in 1 to 8 generation steps on both the SDXL and SD1.5 architectures. For example, in 1-step inference, Hyper-SDXL's CLIP score and Aes score are +0.68 and +0.51 higher than SDXL-Lightning, respectively. In addition, the open-source nature of Hyper-SD promotes the development of the generative AI community, allowing researchers and developers to further explore and improve the model.
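
For illustration, the sketch below shows what few-step inference with the Hyper-SD LoRA could look like using the diffusers library. The base-model and checkpoint names (stabilityai/stable-diffusion-xl-base-1.0, ByteDance/Hyper-SD, Hyper-SDXL-1step-lora.safetensors) follow the public Hugging Face releases and are assumptions here, not part of this tutorial's environment.

```python
import torch
from diffusers import DiffusionPipeline, TCDScheduler
from huggingface_hub import hf_hub_download

# Assumed model identifiers from the public Hugging Face releases (not shipped with this tutorial).
base_model = "stabilityai/stable-diffusion-xl-base-1.0"
lora_path = hf_hub_download("ByteDance/Hyper-SD", "Hyper-SDXL-1step-lora.safetensors")

# Load the SDXL base model and fuse in the Hyper-SD LoRA weights.
pipe = DiffusionPipeline.from_pretrained(
    base_model, torch_dtype=torch.float16, variant="fp16"
).to("cuda")
pipe.load_lora_weights(lora_path)
pipe.fuse_lora()

# Consistency-style scheduler; classifier-free guidance is disabled for 1-step generation.
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)

image = pipe(
    prompt="a lighthouse on a cliff at sunset, oil painting",
    num_inference_steps=1,
    guidance_scale=0.0,
    eta=1.0,
).images[0]
image.save("hyper_sdxl_1step.png")
```

Swapping the checkpoint for one of the multi-step LoRAs (e.g. an 8-step variant) and raising num_inference_steps accordingly trades a little speed for additional detail.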

2. Operation steps

1. After cloning and starting the container, click the API address to open the Web UI (the model is large, so it takes 1-2 minutes to load before the interface appears at the API address).
2. Optionally set the prompt and related parameters, then continue creating; the following sampling parameters can be adjusted (they map onto pipeline arguments as in the sketch after these steps):
  • Number of Images: the number of images to generate per run.
  • Inference Steps: the number of denoising steps (1-8 with Hyper-SD).
  • Prompt: the content of the image to be generated (English only).
  • ControlNet Conditioning Scale: the weight applied to the ControlNet conditioning, i.e. how closely the output follows the drawn sketch.
  • Seed: the random seed, for reproducible generation.
3. Draw on the canvas on the left to see the generated image update in real time.
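
For reference, the interface parameters above correspond roughly to the arguments of a diffusers ControlNet pipeline. The sketch below is a minimal, assumed reconstruction of such a real-time-painting backend; the ControlNet and LoRA checkpoint names (lllyasviel/sd-controlnet-scribble, Hyper-SD15-1step-lora.safetensors) and the input file sketch.png are illustrative assumptions, not taken from this tutorial.

```python
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, TCDScheduler
from huggingface_hub import hf_hub_download
from PIL import Image

# Assumed checkpoints; the tutorial container ships its own models.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-scribble", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights(hf_hub_download("ByteDance/Hyper-SD", "Hyper-SD15-1step-lora.safetensors"))
pipe.fuse_lora()
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)

# Hypothetical sketch exported from the drawing canvas on the left.
sketch = Image.open("sketch.png").convert("RGB")

images = pipe(
    prompt="a cozy cabin in the woods, watercolor",    # Prompt (English only)
    image=sketch,
    num_images_per_prompt=1,                           # Number of Images
    num_inference_steps=1,                             # Inference Steps
    controlnet_conditioning_scale=0.8,                 # ControlNet Conditioning Scale
    generator=torch.Generator("cuda").manual_seed(42), # Seed
    guidance_scale=0.0,
    eta=1.0,
).images
images[0].save("result.png")
```

Because the Hyper-SD LoRA allows single-step sampling, the pipeline can be rerun on every brush stroke, which is what makes the "draw and watch the image change" interaction feel real-time.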