1. Tutorial Introduction

EvoSearch-codes, released on May 1, 2025 by the Hong Kong University of Science and Technology and the Kuaishou Kling team, implements EvoSearch, an evolutionary search method. It improves generation quality by spending more computation at inference time, supports both image and video generation, and is compatible with state-of-the-art diffusion-based and flow-based models. EvoSearch achieves state-of-the-art results on a range of tasks without any training or gradient updates, and demonstrates good scaling ability, robustness, and generalization. With increased test-time computation, EvoSearch shows that SD2.1 and Flux.1-dev have the potential to rival or even surpass GPT-4o. For video generation, Wan 1.3B with EvoSearch also outperforms Wan 14B and Hunyuan 13B, which shows that test-time scaling can complement training-time scaling and leaves room for further research. Related paper: Scaling Image and Video Generation via Test-Time Evolutionary Search.
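To make the idea of an evolutionary search at test time more concrete, here is a minimal toy sketch. It is not the EvoSearch implementation: it only shows a generic select-and-mutate loop over candidate vectors that stand in for initial noise. The function names, `toy_reward`, and all hyperparameters are assumptions made for illustration; in a real pipeline the reward would come from scoring the generated image or video.

```python
import random


def evolutionary_search(reward, dim, population=8, generations=5,
                        elite=2, sigma=0.3, seed=0):
    """Toy evolutionary search: keep the best candidates, mutate them, repeat."""
    rng = random.Random(seed)
    # Initial population: Gaussian vectors (stand-ins for initial noise).
    pop = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(population)]
    history = []
    for _ in range(generations):
        # Rank candidates by reward, best first.
        ranked = sorted(pop, key=reward, reverse=True)
        history.append(reward(ranked[0]))
        # Elites survive unchanged, so the best reward never decreases.
        parents = ranked[:elite]
        children = []
        while len(parents) + len(children) < population:
            parent = rng.choice(parents)
            # Mutation: add small Gaussian perturbations to a parent.
            children.append([x + rng.gauss(0.0, sigma) for x in parent])
        pop = parents + children
    best = max(pop, key=reward)
    history.append(reward(best))
    return best, history


def toy_reward(vec):
    # Hypothetical reward: higher when vec is close to the all-ones vector.
    return -sum((x - 1.0) ** 2 for x in vec)


best, history = evolutionary_search(toy_reward, dim=4)
print(history)  # best reward per generation, non-decreasing
```

More generations or a larger population mean more reward evaluations, which is the sense in which extra inference-time computation is traded for quality.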

This tutorial runs on a single RTX A6000 GPU and provides three examples for testing: Wan Video Generation, SD Image Generation, and FLUX Image Generation.

2. Project Examples

3. Operation Steps

1. After starting the container, click the API address to open the web interface.

2. Usage steps

If "Bad Gateway" is displayed, the model is still initializing. Because the model is large, wait about 2-3 minutes and then refresh the page.
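If you prefer to script the wait instead of refreshing manually, the following sketch polls an address until it stops returning "Bad Gateway" (HTTP 502). It uses only the Python standard library. The URL is a placeholder for your own API address, and `wait_until_ready` with its `fetch` and `sleep` parameters is a hypothetical helper, not part of the tutorial's code.

```python
import time
import urllib.error
import urllib.request


def http_status(url):
    """Return the HTTP status code for url, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 502 while the model is still loading
    except OSError:
        return None  # connection refused, timeout, DNS failure, ...


def wait_until_ready(url, fetch=http_status, attempts=12, delay=15.0,
                     sleep=time.sleep):
    """Poll url until it answers with HTTP 200; return True on success."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next try
    return False


if __name__ == "__main__":
    # Placeholder address: replace with the API address of your container.
    print(wait_until_ready("http://localhost:8080"))
```

With the defaults, the helper retries for a little under three minutes, which matches the waiting time suggested above.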

2.1 Wan Video Generation

Tip: The video takes approximately 5-8 minutes to generate.

Parameter Description

Advanced Settings
- Random Seed: Seed for the random number generator.
- Height: Height of the generated video.
- Width: Width of the generated video.
- Video duration: Length of the generated video.
- Inference Steps: Number of inference steps.
- Guidance Scale: Controls how strongly the text prompt influences the generated video.
- Iteration: Number of iterations.
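To illustrate why the Random Seed setting matters, the sketch below shows that the same seed reproduces the same starting noise while a different seed gives different noise. It uses only Python's standard library. `sample_noise` is a hypothetical helper for illustration; the application presumably seeds its own generator internally.

```python
import random


def sample_noise(seed, size=4):
    """Draw a small Gaussian noise vector from a generator fixed by seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


# Same seed: identical noise, so a run can be repeated.
same = sample_noise(42) == sample_noise(42)
# Different seed: different noise, so the output varies.
different = sample_noise(42) != sample_noise(43)
print(same, different)
```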

2.2 SD Image Generation

Tip: English prompts are recommended.

Advanced Settings
- Random Seed: Seed for the random number generator.
- Image Size: Size of the generated image.
- Inference Steps: Number of inference steps.
- CFG Scale: Controls how strongly the text prompt influences the generated image.
- Iteration: Number of iterations.
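CFG stands for classifier-free guidance. In its standard formulation, the model makes one prediction without the prompt and one with it, and the scale controls how far the result is pushed toward the prompt-conditioned prediction. The sketch below assumes this interface follows that standard formula; `apply_cfg` and the plain Python lists are illustrative only, since a real pipeline would operate on tensors.

```python
def apply_cfg(uncond, cond, scale):
    """Classifier-free guidance: uncond + scale * (cond - uncond), per element."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]  # prediction without the prompt
cond = [1.0, 3.0]    # prediction with the prompt

print(apply_cfg(uncond, cond, 0.0))  # scale 0 ignores the prompt
print(apply_cfg(uncond, cond, 1.0))  # scale 1 equals the conditioned prediction
print(apply_cfg(uncond, cond, 2.0))  # scale > 1 pushes further toward the prompt
```

A higher scale therefore follows the prompt more closely, usually at the cost of less variety in the output.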

2.3 FLUX Image Generation

4. Discussion

🖌️ If you come across a high-quality project, please leave us a message to recommend it! We have also set up a tutorial discussion group. Scan the QR code and add the note [SD Tutorial] to join the group, discuss technical issues, and share your results.

Citation Information

The citation information for this project is as follows:

@misc{he2025scaling,
    title={Scaling Image and Video Generation via Test-Time Evolutionary Search},
    author={Haoran He and Jiajun Liang and Xintao Wang and Pengfei Wan and Di Zhang and Kun Gai and Ling Pan},
    year={2025},
    eprint={2505.17618},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

