HyperAI

In-Context Edit: Command-driven Image Generation and Editing


1. Tutorial Introduction

In-Context Edit (ICEdit) is an efficient framework for instruction-based image editing, released by Zhejiang University and Harvard University on April 29, 2025. Compared with previous methods, ICEdit trains only about 1% as many parameters (200M) and uses only about 0.1% as much training data (50k samples), yet it shows strong generalization and handles a wide range of editing tasks. Compared with commercial models such as Gemini and GPT-4o, it is open source, lower in cost, faster, and more capable. The related paper is "In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer".

This tutorial runs on a single RTX 4090. Reproducing the officially reported ~9-second generation time requires a more powerful GPU. The project currently supports only English text prompts.

Models used in this project:

  • normal-lora
  • FLUX.1-Fill-dev

2. Project Examples

Comparison with other commercial models

3. Operation steps

1. After starting the container, click the API address to enter the Web interface

If "Bad Gateway" is displayed, it means the model is initializing. Since the model is large, please wait about 1-2 minutes and refresh the page.

2. Usage Demonstration

❗️Important usage tips:

  • Guidance Scale: controls how strongly the conditional input (such as text or an image) influences the generated result. A higher guidance value keeps the output closer to the input conditions, while a lower value preserves more randomness.
  • Number of inference steps: the number of denoising iterations the model runs to produce the result. More steps generally yield a more refined output, but increase computation time.
  • Seed: the random seed that controls the randomness of the generation process. The same seed produces the same result (provided all other parameters are unchanged), which is essential for reproducing results.
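How these three parameters interact can be sketched with a toy sampler in plain Python. This is not the project's actual pipeline: the function names and the scalar "latent" are illustrative, but the blend formula is the standard classifier-free guidance rule, and the seeded RNG shows why a fixed seed makes generation reproducible.

```python
import random

def guided_prediction(cond: float, uncond: float, guidance_scale: float) -> float:
    """Classifier-free guidance blend: scale > 1 pushes the result
    toward the conditional (prompt-following) prediction."""
    return uncond + guidance_scale * (cond - uncond)

def generate(seed: int, steps: int, guidance_scale: float) -> list[float]:
    """Toy sampler: `steps` guided update iterations driven by a seeded RNG."""
    rng = random.Random(seed)          # fixed seed -> deterministic trajectory
    latent = rng.random()              # stand-in for the initial noise latent
    trajectory = []
    for _ in range(steps):
        cond = rng.random() - 0.5      # stand-in for the prompt-conditioned prediction
        uncond = 0.0                   # stand-in for the unconditional prediction
        latent += 0.1 * guided_prediction(cond, uncond, guidance_scale)
        trajectory.append(latent)
    return trajectory

# Same seed and parameters -> identical output (reproducibility).
assert generate(seed=42, steps=8, guidance_scale=7.5) == \
       generate(seed=42, steps=8, guidance_scale=7.5)
# Changing the seed changes the output.
assert generate(seed=0, steps=8, guidance_scale=7.5) != \
       generate(seed=42, steps=8, guidance_scale=7.5)
```

In the real pipeline the latent is an image tensor and each step is a full denoising pass, but the same logic applies: guidance scale weights the prompt's influence at every step, more steps refine the result further, and the seed fixes the entire random trajectory.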

4. Discussion

🖌️ If you come across a high-quality project, please leave a message to recommend it! We have also set up a tutorial exchange group. Friends are welcome to scan the QR code with the note [SD Tutorial] to join the group, discuss technical issues, and share results↓

Citation Information

Thanks to GitHub user SuperYang for deploying this tutorial. The citation information for this project is as follows:

@misc{zhang2025ICEdit,
      title={In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer}, 
      author={Zechuan Zhang and Ji Xie and Yu Lu and Zongxin Yang and Yi Yang},
      year={2025},
      eprint={2504.20690},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2504.20690}, 
}