HyperAI

SRPO: Image Generation Says Goodbye to the "AI Look"!

1. Tutorial Introduction

SRPO (Semantic Relative Preference Optimization) is a text-to-image generation model released in September 2025 by the Tencent Hunyuan team together with the School of Science and Engineering of the Chinese University of Hong Kong, Shenzhen, and the Shenzhen International Graduate School of Tsinghua University. It formulates the reward as a text-conditioned signal, so the reward can be adjusted online, which reduces the dependence on offline reward fine-tuning. SRPO also introduces Direct-Align, a technique that uses a predefined noise prior to restore the original image directly from any time step, avoiding over-optimization at late time steps. Experiments on the FLUX.1.dev model show that SRPO significantly improves the human-evaluated realism and aesthetic quality of generated images. Training is also highly efficient: optimization completes in only 10 minutes. The accompanying paper is "Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference".
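
To make the idea of restoring an image from any time step more concrete, here is a minimal sketch. It assumes a simple linear interpolation between image and noise, x_t = (1 - t) * x0 + t * eps, and it is a simplified illustration rather than the paper's full Direct-Align procedure. The function names and the toy three-pixel "image" are hypothetical. The point is only that when the injected noise is known in advance, the clean image can be recovered in one step, even from a heavily noised sample.

```python
import random


def add_noise(x0, eps, t):
    """Assumed linear interpolation: x_t = (1 - t) * x0 + t * eps."""
    return [(1.0 - t) * a + t * e for a, e in zip(x0, eps)]


def recover_clean(xt, eps, t):
    """Invert the interpolation when the injected noise eps is known (t < 1)."""
    if not 0.0 <= t < 1.0:
        raise ValueError("t must be in [0, 1)")
    return [(x - t * e) / (1.0 - t) for x, e in zip(xt, eps)]


rng = random.Random(0)
x0 = [0.2, -0.5, 0.9]                     # toy "image" with three pixels
eps = [rng.gauss(0.0, 1.0) for _ in x0]   # predefined noise, stored for later use
xt = add_noise(x0, eps, t=0.8)            # heavily noised sample
x0_hat = recover_clean(xt, eps, t=0.8)    # one-step restoration using the known noise
max_err = max(abs(a - b) for a, b in zip(x0, x0_hat))
```

Because the noise is fixed ahead of time, the recovery is exact up to floating-point error at any t below 1, which is why early, very noisy time steps can also be used.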

This tutorial uses a single A6000 GPU as the computing resource. The model currently supports English prompts only.

2. Effect Display

3. Operation Steps

1. Start the container

If "Bad Gateway" is displayed, the model is still initializing. Because the model is large, wait about 2-3 minutes and then refresh the page.
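
Instead of refreshing by hand, the wait can be scripted. The following is a minimal sketch using only the Python standard library. The helper names `http_status` and `wait_until_ready` are hypothetical, and the assumption that the page answers with HTTP 200 once the model has finished loading may not hold for every deployment.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the connection fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 502 (Bad Gateway) while the model is still loading
    except (urllib.error.URLError, OSError):
        return None


def wait_until_ready(url, probe=http_status, max_wait=300, interval=10, sleep=time.sleep):
    """Poll url until probe returns 200 or about max_wait seconds have been spent."""
    waited = 0
    while True:
        if probe(url) == 200:
            return True
        if waited >= max_wait:
            return False
        sleep(interval)
        waited += interval
```

Call `wait_until_ready(...)` with the address of your own container; it returns `True` as soon as the page responds normally and `False` if the wait budget runs out.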

2. Usage steps

Specific parameters:

  • Prompt: The text description of the image to generate.
  • Width: The width of the generated image.
  • Height: The height of the generated image.
  • Guidance Scale: Controls how strongly the text prompt influences the final image.
  • Inference Steps: The number of iterations in the generation process; it affects both image quality and computation time.
  • Seed: The random seed, which sets the initial state of the random generation process.
  • Seed Used: The seed actually used for the generated image.
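
The parameters above can be pictured as a small settings object. This is a hypothetical sketch, not the tutorial's actual interface code: the class name, the default values, and the convention that a negative seed means "pick one at random" are all assumptions made for illustration. It shows how a requested Seed can differ from the reported Seed Used.

```python
import random
from dataclasses import dataclass

MAX_SEED = 2**32 - 1  # assumed upper bound for seeds


@dataclass
class GenerationParams:
    prompt: str
    width: int = 1024            # assumed default
    height: int = 1024           # assumed default
    guidance_scale: float = 3.5  # assumed default
    inference_steps: int = 50    # assumed default
    seed: int = -1               # hypothetical convention: negative means random

    def validate(self):
        """Reject obviously invalid settings before starting a generation."""
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.width <= 0 or self.height <= 0:
            raise ValueError("width and height must be positive")
        if self.inference_steps < 1:
            raise ValueError("inference_steps must be at least 1")
        if self.guidance_scale < 0:
            raise ValueError("guidance_scale must be non-negative")


def resolve_seed(seed, rng=random):
    """Return the seed that will actually be used (the 'Seed Used' value)."""
    if seed < 0:
        return rng.randint(0, MAX_SEED)
    return seed
```

Reusing the reported Seed Used value as the Seed, with all other parameters unchanged, is the usual way to reproduce a result.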

4. Discussion

🖌️ If you come across a high-quality project, please send us a message to recommend it! We have also set up a tutorial discussion group. Scan the QR code and add the note [SD Tutorial] to join the group, discuss technical issues, and share your results ↓

Citation Information

The citation information for this project is as follows:

@misc{shen2025directlyaligningdiffusiontrajectory,
      title={Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference}, 
      author={Xiangwei Shen and Zhimin Li and Zhantao Yang and Shiyi Zhang and Yingfang Zhang and Donghao Li and Chunyu Wang and Qinglin Lu and Yansong Tang},
      year={2025},
      eprint={2509.06942},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2509.06942}, 
}