---
title: PhotoMaker
app_file: bm.py
sdk: gradio
sdk_version: 4.14.0
---

## PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding

[[Paper](https://huggingface.co/papers/2312.04461)]   [[Project Page](https://photo-maker.github.io)]   [[Model Card](https://huggingface.co/TencentARC/PhotoMaker)]

[[🤗 Demo (Realistic)](https://huggingface.co/spaces/TencentARC/PhotoMaker)]   [[🤗 Demo (Stylization)](https://huggingface.co/spaces/TencentARC/PhotoMaker-Style)]

If the ID fidelity is not enough for you, please try our [stylization application](https://huggingface.co/spaces/TencentARC/PhotoMaker-Style); you may be pleasantly surprised.

---

Official implementation of **[PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding](https://huggingface.co/papers/2312.04461)**.

### 🌠 **Key Features:**

1. Rapid customization **within seconds**, with no additional LoRA training.
2. Ensures impressive ID fidelity, offering diversity, promising text controllability, and high-quality generation.
3. Can serve as an **Adapter** to collaborate with other base models alongside LoRA modules from the community.

---

❗❗ Note: If you know of any PhotoMaker-based resources or applications, please leave them in the [discussion](https://github.com/TencentARC/PhotoMaker/discussions/36) and we will list them in the [Related Resources](https://github.com/TencentARC/PhotoMaker?tab=readme-ov-file#related-resources) section of the README.
## 🚩 **New Features/Updates**

- ✅ Jan. 15, 2024. We release PhotoMaker.

---

## 🔥 **Examples**

### Realistic generation

- [![Huggingface PhotoMaker](https://img.shields.io/static/v1?label=Demo&message=Huggingface%20Gradio&color=orange)](https://huggingface.co/spaces/TencentARC/PhotoMaker)
- [**PhotoMaker notebook demo**](photomaker_demo.ipynb)

### Stylization generation

Note: for better stylization, we only change the base model and add LoRA modules.

- [![Huggingface PhotoMaker-Style](https://img.shields.io/static/v1?label=Demo&message=Huggingface%20Gradio&color=orange)](https://huggingface.co/spaces/TencentARC/PhotoMaker-Style)
- [**PhotoMaker-Style notebook demo**](photomaker_style_demo.ipynb)

# 🔧 Dependencies and Installation

- Python >= 3.8 (we recommend [Anaconda](https://www.anaconda.com/download/#linux) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html))
- [PyTorch >= 2.0.0](https://pytorch.org/)

```bash
conda create --name photomaker python=3.10
conda activate photomaker
pip install -U pip

# Install requirements
pip install -r requirements.txt

# Install photomaker
pip install git+https://github.com/TencentARC/PhotoMaker.git
```

Then import the pipeline to use it:

```python
from photomaker import PhotoMakerStableDiffusionXLPipeline
```

# ⏬ Download Models

The model will be downloaded automatically by the following two lines:

```python
from huggingface_hub import hf_hub_download
photomaker_path = hf_hub_download(repo_id="TencentARC/PhotoMaker", filename="photomaker-v1.bin", repo_type="model")
```

You can also download it manually from this [url](https://huggingface.co/TencentARC/PhotoMaker).

# 💻 How to Test

## Use like [diffusers](https://github.com/huggingface/diffusers)

- Dependency

```py
import torch
import os
from diffusers.utils import load_image
from diffusers import EulerDiscreteScheduler
from photomaker import PhotoMakerStableDiffusionXLPipeline

# The original snippet leaves these three names undefined.
# The values below are example assumptions; adjust them to your setup.
base_model_path = "stabilityai/stable-diffusion-xl-base-1.0"
device = "cuda" if torch.cuda.is_available() else "cpu"
num_steps = 50

### Load base model
pipe = PhotoMakerStableDiffusionXLPipeline.from_pretrained(
    base_model_path,  # can change to any base model based on SDXL
    torch_dtype=torch.bfloat16,
    use_safetensors=True,
    variant="fp16"
).to(device)

### Load PhotoMaker checkpoint (photomaker_path comes from the download step above)
pipe.load_photomaker_adapter(
    os.path.dirname(photomaker_path),
    subfolder="",
    weight_name=os.path.basename(photomaker_path),
    trigger_word="img"  # define the trigger word
)

pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)

### Also can cooperate with other LoRA modules
# pipe.load_lora_weights(os.path.dirname(lora_path), weight_name=lora_model_name, adapter_name="xl_more_art-full")
# pipe.set_adapters(["photomaker", "xl_more_art-full"], adapter_weights=[1.0, 0.5])

pipe.fuse_lora()
```

- Input ID Images

```py
### define the input ID images
input_folder_name = './examples/newton_man'
image_basename_list = os.listdir(input_folder_name)
image_path_list = sorted([os.path.join(input_folder_name, basename) for basename in image_basename_list])

input_id_images = []
for image_path in image_path_list:
    input_id_images.append(load_image(image_path))
```

- Generation

```py
# Note that the trigger word `img` must follow the class word for personalization
prompt = "a half-body portrait of a man img wearing the sunglasses in Iron man suit, best quality"
negative_prompt = "(asymmetry, worst quality, low quality, illustration, 3d, 2d, painting, cartoons, sketch), open mouth, grayscale"
generator = torch.Generator(device=device).manual_seed(42)
image = pipe(
    prompt=prompt,
    input_id_images=input_id_images,
    negative_prompt=negative_prompt,
    num_images_per_prompt=1,
    num_inference_steps=num_steps,
    start_merge_step=10,
    generator=generator,
).images[0]  # take the first (and only) generated image
image.save('out_photomaker.png')
```
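Two small guards can make the snippets above more robust. The sketch below is not part of PhotoMaker: `list_image_paths`, `check_trigger_word`, the extension tuple, and the class-word list are all hypothetical names and values chosen for illustration. It assumes that the input folder may contain non-image files and that a prompt should contain the trigger word exactly once, directly after a class word.

```python
import os
import string

# Assumption: which file extensions count as images (not taken from PhotoMaker).
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".webp")

# Assumption: an illustrative, non-exhaustive list of class words.
CLASS_WORDS = ("man", "woman", "boy", "girl", "person")


def list_image_paths(folder, extensions=IMAGE_EXTENSIONS):
    """Return sorted paths of files in `folder` with an image-like extension."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(extensions)
    )


def check_trigger_word(prompt, trigger_word="img", class_words=CLASS_WORDS):
    """Return the class word that precedes the trigger word, or raise ValueError."""
    # Split on whitespace and drop surrounding punctuation such as commas.
    words = [w.strip(string.punctuation).lower() for w in prompt.split()]
    positions = [i for i, w in enumerate(words) if w == trigger_word]
    if len(positions) != 1:
        raise ValueError(
            f"expected the trigger word '{trigger_word}' exactly once, "
            f"found it {len(positions)} times"
        )
    idx = positions[0]
    if idx == 0 or words[idx - 1] not in class_words:
        raise ValueError(
            f"the trigger word '{trigger_word}' must directly follow a class word"
        )
    return words[idx - 1]
```

With these helpers, `image_path_list = list_image_paths(input_folder_name)` could replace the `os.listdir` lines, and calling `check_trigger_word(prompt)` before `pipe(...)` would fail early on a malformed prompt. For the example prompt above it returns the class word `man`.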
## Start a local gradio demo

Run the following command:

```bash
python gradio_demo/app.py
```

You can customize this script in [this file](gradio_demo/app.py).

## Usage Tips:

- Upload more photos of the person to be customized to improve ID fidelity. If the input is Asian face(s), consider adding 'asian' before the class word, e.g., `asian woman img`.
- When stylizing, does the generated face look too realistic? Adjust the Style strength to 30-50. The larger the number, the lower the ID fidelity, but the better the stylization. You could also try other base models or LoRAs with good stylization effects.
- For faster speed, reduce the number of generated images and sampling steps. However, note that reducing the sampling steps may compromise ID fidelity.

# Related Resources

- [Replicate demo of PhotoMaker](https://replicate.com/jd7h/photomaker) by [@yorickvP](https://github.com/yorickvP), which ports PhotoMaker to Replicate.
- [Windows version of PhotoMaker](https://github.com/bmaltais/PhotoMaker/tree/v1.0.1) by [@bmaltais](https://github.com/bmaltais), which makes PhotoMaker easy to deploy on Windows. The description can be found in [this link](https://github.com/TencentARC/PhotoMaker/discussions/36#discussioncomment-8156199).

# 🤗 Acknowledgements

- PhotoMaker is co-hosted by Tencent ARC Lab and Nankai University [MCG-NKU](https://mmcheng.net/cmm/).
- Inspired by many excellent demos and repos, including [IP-Adapter](https://github.com/tencent-ailab/IP-Adapter), [multimodalart/Ip-Adapter-FaceID](https://huggingface.co/spaces/multimodalart/Ip-Adapter-FaceID), [FastComposer](https://github.com/mit-han-lab/fastcomposer), and [T2I-Adapter](https://github.com/TencentARC/T2I-Adapter). Thanks for their great work!
- Thanks to the Venus team in Tencent PCG for their feedback and suggestions.
- Thanks to the HuggingFace team for their generous support!

# Disclaimer

This project strives to positively impact the domain of AI-driven image generation.
Users are granted the freedom to create images using this tool, but they are expected to comply with local laws and use it responsibly. The developers do not assume any responsibility for potential misuse by users.

# BibTeX

If you find PhotoMaker useful for your research and applications, please cite using this BibTeX:

```bibtex
@article{li2023photomaker,
  title={PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding},
  author={Li, Zhen and Cao, Mingdeng and Wang, Xintao and Qi, Zhongang and Cheng, Ming-Ming and Shan, Ying},
  journal={arXiv preprint arxiv:2312.04461},
  year={2023}
}
```