ostris/photo-maker-face-sdxl

Note: This repository is mainly useful if you are writing your own fine-tuning script. If you just want to run inference, use the original PhotoMaker model, TencentARC/PhotoMaker.

These are the TencentARC/PhotoMaker weights split into separate pieces so that each piece is easier to load and fine-tune on its own. Apart from the split, the weights are identical to the original.

The CLIP vision model can be loaded with:

from transformers import CLIPImageProcessor, CLIPVisionModelWithProjection

image_preprocessor = CLIPImageProcessor.from_pretrained("ostris/photo-maker-face-sdxl")
clip_vision = CLIPVisionModelWithProjection.from_pretrained(
    "ostris/photo-maker-face-sdxl",
    ignore_mismatched_sizes=True
)

Loading will warn about unused weights because fuse_module and visual_projection_2 are included in the file but are not part of the CLIP vision model.
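To make the warning concrete, here is a minimal sketch of how checkpoint keys could be sorted into those CLIP uses and those it ignores. The prefixes "fuse_module." and "visual_projection_2." are an assumption based on the module names mentioned in this README, and the key names in the example list are purely illustrative, not read from the actual file.

```python
# Assumed key prefixes for the PhotoMaker-only parts of the checkpoint.
EXTRA_PREFIXES = ("fuse_module.", "visual_projection_2.")


def split_checkpoint_keys(keys, extra_prefixes=EXTRA_PREFIXES):
    """Return (clip_keys, extra_keys) for an iterable of state-dict key names."""
    prefixes = tuple(extra_prefixes)  # str.startswith needs a tuple
    clip_keys, extra_keys = [], []
    for key in keys:
        if key.startswith(prefixes):
            extra_keys.append(key)
        else:
            clip_keys.append(key)
    return clip_keys, extra_keys


# Illustrative key names only; the real checkpoint has many more entries.
example_keys = [
    "vision_model.embeddings.class_embedding",
    "visual_projection.weight",
    "visual_projection_2.weight",
    "fuse_module.layer_norm.weight",
]
clip_keys, extra_keys = split_checkpoint_keys(example_keys)
print("ignored by CLIP:", extra_keys)
```

Keys in the second list are the ones the CLIP loader would report as unused; they are not lost, only skipped by that particular model class.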

Using the included Python file (modified only to support from_pretrained for now), the PhotoMakerIDEncoder can be loaded with:

id_encoder = PhotoMakerIDEncoder.from_pretrained("ostris/photo-maker-face-sdxl")

The fuse weights are included in the vision encoder checkpoint, but are also separated out into pytorch_fuse_module_weights.safetensors so they can be loaded separately when fine-tuning only the fuse_module and/or the LoRA.
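A rough sketch of loading the standalone fuse weights and training only that module might look like the following. The helper names (strip_prefix, load_fuse_module, freeze_all_but) are hypothetical and not part of this repository. It also assumes the encoder exposes the module as id_encoder.fuse_module, and that keys in the safetensors file are either bare or prefixed with "fuse_module."; check the actual key names before relying on it.

```python
FUSE_FILE = "pytorch_fuse_module_weights.safetensors"


def strip_prefix(state_dict, prefix="fuse_module."):
    """Drop `prefix` from any key that carries it; leave other keys untouched."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


def load_fuse_module(id_encoder, repo_id="ostris/photo-maker-face-sdxl"):
    """Download the standalone fuse weights and load them into the encoder."""
    # Imported lazily so the other helpers work without these packages.
    from huggingface_hub import hf_hub_download
    from safetensors.torch import load_file

    state_dict = load_file(hf_hub_download(repo_id, FUSE_FILE))
    # Assumption: the attribute is named `fuse_module` on the ID encoder.
    id_encoder.fuse_module.load_state_dict(strip_prefix(state_dict))
    return id_encoder


def freeze_all_but(model, trainable_prefixes=("fuse_module.",)):
    """Enable gradients only for parameters whose names start with a prefix."""
    prefixes = tuple(trainable_prefixes)
    trainable = []
    for name, param in model.named_parameters():
        keep = name.startswith(prefixes)
        param.requires_grad_(keep)
        if keep:
            trainable.append(name)
    return trainable
```

With an id_encoder already loaded, calling load_fuse_module(id_encoder) followed by freeze_all_but(id_encoder) would leave only the fuse parameters trainable; the returned list of names can be used to build the optimizer's parameter group.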

The LoRA can also be loaded separately with standard Diffusers LoRA loading:

# pipeline is assumed to be an already constructed Diffusers SDXL pipeline
pipeline.load_lora_weights("ostris/photo-maker-face-sdxl", adapter_name="photomaker")