HyperAI

PhotoMaker V2 Demo: Generate Personalized Portrait Photos in Seconds

PhotoMaker V2: Improved ID fidelity and greater controllability compared with V1

Tutorial Introduction

PhotoMaker is an efficient portrait customization model open-sourced by the Tencent team in 2024. Given one or more portrait photos, it can quickly generate customized photos in a chosen artistic style. Beyond personalized portraits, it can change a subject's age or gender and blend the features of several people into a new identity, which makes it a practical AI image generation tool. This tutorial covers PhotoMaker V2, which greatly improves identity consistency and controllability compared with V1.

The environment for this tutorial is already set up. You only need to run one command to try the demo.

Major improvements in PhotoMaker V2

  • ID fidelity is further improved, especially for single image input and Asian face input. Inputting more face images still produces better results.
  • By integrating ControlNet, T2I-Adapter and IP-Adapter, the generation process becomes more controllable. The research team provides corresponding scripts for reference. In addition, PhotoMaker V2 can be combined with IP-Adapter-FaceID, InstantID or a character LoRA to achieve better ID consistency.
  • PhotoMaker V2 inherits the strengths of PhotoMaker V1, such as high-quality and diverse generation and strong text control. It also supports the earlier applications, such as turning people in old photos or paintings into realistic portraits, mixing identities, and changing age or gender.

Example results

How to run

1. After cloning and starting the container, open the workspace


2. Open a new terminal and run the command bash run.sh

3. Once the terminal output shows port 8080, click the link next to API address on the right to open the demo

4. The demo page shows the following interface

  • Upload the portrait image you want to use (you can upload multiple images)
  • Enter a prompt in English. The model generates images based on this prompt.

Note that the class word describing the subject must be followed by the trigger word img, for example man img, woman img, girl img.
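The trigger word rule can be checked before submitting a prompt. The following is a minimal sketch, not part of the demo: the function name has_trigger_after_class_word is hypothetical, and the only rule it encodes is the one stated in the note, that img must appear directly after some other word.

```python
import re

TRIGGER_WORD = "img"  # trigger word named in the note above


def has_trigger_after_class_word(prompt: str) -> bool:
    """Return True if the prompt contains '<word> img', e.g. 'man img'.

    Hypothetical helper for illustration; the demo performs its own checks.
    """
    tokens = re.findall(r"[a-z]+", prompt.lower())
    # The trigger word must follow another word (the class word),
    # so it cannot be the first token of the prompt.
    return any(tok == TRIGGER_WORD and i > 0 for i, tok in enumerate(tokens))


print(has_trigger_after_class_word("a photo of a man img wearing a hat"))
print(has_trigger_after_class_word("a photo of a man wearing a hat"))
```

The helper only looks at word order. It does not know which words are valid class words, so a prompt such as "of img" would still pass.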

  • Select the desired style from the Style template list. Each style is a preset prompt.
  • Click Submit to generate the image.

The bottom of the page lists some examples. Click one to load it directly.
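If the page from step 3 does not load, one quick check is whether anything is listening on port 8080 yet. The sketch below assumes it is run inside the same container, so the host is 127.0.0.1. The function name is_port_open is hypothetical and uses only the Python standard library.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 8080,
                 timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError if nothing accepts the connection
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling is_port_open() returns False until the server started by bash run.sh accepts connections. A True result only shows that some process is listening on the port, not that the model has finished loading.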

You can also change the advanced settings according to your needs. The following are some parameter descriptions.

  • Negative Prompt: This specifies the features that should be avoided when generating the output. By inputting terms such as “bad symmetry, bad quality, low quality, illustration, 3D, 2D, painting, cartoon, sketch, open mouth”, the model will try to avoid including these features in the generated images.
  • Number of sample steps: This controls the number of denoising steps the model takes when generating an image. More steps generally produce higher quality images because the model has more opportunities to refine the output, at the cost of longer generation time.
  • Style strength: This indicates how much the specified style should influence the output image. The higher the percentage, the more influential the style will be.
  • Number of output images: This determines how many images the model generates in a single run.
  • Guidance scale: This parameter adjusts how strictly the model should follow the prompt. A higher guidance scale means the model will follow the prompt more strictly, which may lead to more accurate but less creative results.
  • Seed: The seed value initializes the random number generator and therefore affects the output. Setting a specific seed makes the results reproducible. Checking Randomize seed produces a different image each time.
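To make the settings above concrete, here is a minimal sketch of how they could be grouped and sanity-checked, and of why a fixed seed gives repeatable output. Everything in it is an assumption for illustration: the class GenerationSettings, its field names, its default values, MAX_SEED and the sample_noise stand-in are hypothetical and are not taken from the demo's code.

```python
import random
from dataclasses import dataclass
from typing import List

MAX_SEED = 2**31 - 1  # assumed upper bound, chosen for illustration


@dataclass
class GenerationSettings:
    """Hypothetical container mirroring the advanced settings in the UI."""
    negative_prompt: str = ""
    num_steps: int = 50          # illustrative default
    style_strength: int = 20     # percent, illustrative default
    num_outputs: int = 2         # illustrative default
    guidance_scale: float = 5.0  # illustrative default
    seed: int = 0
    randomize_seed: bool = True

    def validate(self) -> None:
        """Reject values that make no sense for the descriptions above."""
        if self.num_steps < 1:
            raise ValueError("num_steps must be at least 1")
        if not 0 <= self.style_strength <= 100:
            raise ValueError("style_strength is a percentage (0-100)")
        if self.num_outputs < 1:
            raise ValueError("num_outputs must be at least 1")
        if self.guidance_scale < 0:
            raise ValueError("guidance_scale must be non-negative")

    def resolve_seed(self) -> int:
        """Draw a fresh seed if randomize_seed is set, else reuse the fixed one."""
        if self.randomize_seed:
            return random.randint(0, MAX_SEED)
        return self.seed


def sample_noise(seed: int, n: int = 4) -> List[float]:
    """Stand-in for the model's starting noise: same seed, same numbers."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


settings = GenerationSettings(seed=42, randomize_seed=False)
settings.validate()
# With a fixed seed the starting noise is identical on every run.
print(sample_noise(settings.resolve_seed()) == sample_noise(settings.resolve_seed()))
```

The real model draws its noise from its own random number generator, so sample_noise only demonstrates the principle: identical seeds lead to identical starting points, while Randomize seed changes the starting point on every run.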

Discussion and Exchange

🖌️ If you come across a high-quality project, please leave us a message to recommend it! We have also set up a tutorial exchange group. Scan the QR code and add the note [Tutorial Exchange] to join the group, discuss technical issues, and share your results ↓