TokForge: SDXL PhotoMaker (Reference Identity) bundle
The single-subject reference-identity image route for the TokForge
Android app. Attach a photo of a person, then render that person in any scene
("me as a superhero, face visible"). PhotoMaker stacks the reference into the `img`
trigger word and merges an ID embedding into the SDXL UNet, so the rendered face stays
recognizable while the prompt drives the rest of the scene.
This bundle runs on the on-device stable-diffusion.cpp engine (TokForge's PhotoMaker
port) via the helper's native `--photo-maker` path. Full SDXL at 1024² is heavy, so this
tier is offered on 16 GB-class phones. For 8 GB phones, use the lighter
darkmaniac7/TokForge-SD15-IPAdapter plus-face tier instead.
Files
| File | Size | License | Contents |
|---|---|---|---|
| `realvisxl-v40-fp16.safetensors` | ~6.9 GB | OpenRAIL++ | RealVisXL V4.0 (full, multi-step SDXL photoreal finetune): dual CLIP text encoders + UNet + VAE in one self-contained f16 sd.cpp safetensors |
| `photomaker-v1.safetensors` | ~934 MB | Apache-2.0 | PhotoMaker-v1 ID encoder + fuse module (TencentARC/PhotoMaker), loaded via `--photo-maker` to condition identity on the attached reference photo |
MD5SUMS carries integrity hashes.
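A minimal sketch of checking the downloaded files against those hashes. It assumes `MD5SUMS` uses the usual `md5sum` layout (one `<hex digest>  <filename>` pair per line) and sits next to the model files; the function names are illustrative, not part of TokForge.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-GB checkpoints never sit fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5sums(sums_file):
    """Return {filename: True/False} for every entry listed in the sums file.

    Assumption: each non-empty line is '<hex digest>  <filename>' (md5sum format),
    and filenames are relative to the directory holding the sums file.
    """
    sums_path = Path(sums_file)
    results = {}
    for line in sums_path.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")  # md5sum marks binary mode with '*'
        results[name] = md5_of(sums_path.parent / name) == expected.lower()
    return results
```

Calling `verify_md5sums("MD5SUMS")` from the download directory returns a per-file pass/fail map; any `False` entry suggests a truncated or corrupted download.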
Why the FULL (non-Lightning) RealVisXL V4.0, and why f16 (not q8)
PhotoMaker conditions identity by merging its ID embedding LoRA-style into the SDXL UNet weights. This needs:
- A multi-step base, not a few-step Lightning checkpoint. PhotoMaker's ID merge takes effect over a `start_merge_step` schedule across the full ~20-30 step denoise; a 6-step Lightning floor leaves too little room for the identity merge to converge. So this tier ships the full RealVisXL V4.0 (the same base validated in the PhotoMaker identity render-tests), not the Lightning variant the IP-Adapter tier ships.
- An f16 (non-quantized) base. The LoRA-style merge requires F32-promotable weights; on a quantized base the sd.cpp helper asserts (`src1 must be F32`) on both the OpenCL and CPU backends. The base is therefore kept at f16, the same precision/quality choice as the sibling IP-Adapter bundle.
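To make the step-count argument concrete, here is a rough sketch of how many denoise steps carry the identity at different step counts. It assumes the merge begins at a fixed percentage of the total steps. Both that formula and the 20% figure are illustrative assumptions, not values taken from this bundle or confirmed against the sd.cpp helper.

```python
def merge_window(total_steps, start_pct=20):
    """Return (start_merge_step, merged_steps) for a given step count.

    Assumption: the ID merge switches on after `start_pct` percent of the
    denoise steps and stays on until the end. `start_pct` is hypothetical.
    """
    start = (start_pct * total_steps) // 100
    return start, total_steps - start


# Compare a 6-step Lightning-style run with the 20-30 step range used here.
for steps in (6, 20, 30):
    start, merged = merge_window(steps)
    print(f"{steps:>2} steps: merge starts at step {start}, {merged} steps carry the identity")
```

Under these assumptions a 6-step run leaves only a handful of identity-conditioned steps, while a 20-step run leaves several times as many, which matches the reasoning for shipping the full multi-step base.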
How TokForge uses it
In the app (16 GB+ phones): Image model picker → download "PhotoMaker SDXL (Reference Identity)" → attach a face photo as a reference under chat → prompt the scene. The engine is invoked as:
sd -M img_gen \
-m realvisxl-v40-fp16.safetensors \
--photo-maker photomaker-v1.safetensors \
--input-id-images-dir <dir_with_your_face.jpg> \
-p "a photo of a man img as a superhero, face visible, detailed face, looking at viewer" \
-n "<strong negative>" \
--steps 20 --cfg-scale 5.0 \
--sampling-method euler -H 1024 -W 1024
PhotoMaker keys identity to the `img` trigger word: keep `img` in the prompt right after the class word (`a man img`, `a woman img`). The app inserts this automatically. Keep the face visible and unobstructed for a recognizable identity.
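For prompts built outside the app, the trigger-word placement can be sketched as below. This is not TokForge's actual insertion logic; the class-word list and the function name are hypothetical, and only the rule itself (the trigger word `img` directly after the class word) comes from the text above.

```python
import re

# Hypothetical class-word list; extend it to match the prompts you write.
CLASS_WORDS = ("man", "woman", "boy", "girl", "person")
TRIGGER = "img"

_CLASS_WORD_RE = re.compile(r"\b(?:" + "|".join(CLASS_WORDS) + r")\b", re.IGNORECASE)
_TRIGGER_NEXT_RE = re.compile(r"\s+" + TRIGGER + r"\b")


def ensure_trigger(prompt):
    """Insert the trigger word right after the first class word.

    Prompts that already have the trigger in place, or that contain no
    known class word, are returned unchanged.
    """
    match = _CLASS_WORD_RE.search(prompt)
    if match is None:
        return prompt
    rest = prompt[match.end():]
    if _TRIGGER_NEXT_RE.match(rest):
        return prompt  # trigger already follows the class word
    return prompt[:match.end()] + " " + TRIGGER + rest
```

For example, `ensure_trigger("a photo of a man as a superhero")` yields a prompt with `man img` in it, while a prompt that already reads `a woman img ...` passes through untouched.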
Licenses
This is an aggregate of two independently licensed components; each retains its own license:
- RealVisXL V4.0 base (`realvisxl-v40-fp16.safetensors`): OpenRAIL++ (SG161222/RealVisXL_V4.0, the SDXL `openrail++` license). Use must comply with the OpenRAIL++ use-based restrictions.
- PhotoMaker-v1 adapter (`photomaker-v1.safetensors`): Apache-2.0 (TencentARC/PhotoMaker).
Provenance
- Base: `realvisxl-v40-fp16.safetensors`, a self-contained sd.cpp SDXL safetensors built from `SG161222/RealVisXL_V4.0` (full, non-Lightning; fp16 variant weights) via the standard diffusers → single-file SDXL converter (`convert_diffusers_to_original_sdxl.py --half`). Same byte-for-byte layout as the TokForge IP-Adapter tier's SDXL base, but with the full multi-step weights.
- Adapter: `photomaker-v1.safetensors`, the sd.cpp-compatible PhotoMaker-v1 safetensors (TencentARC/PhotoMaker, Apache-2.0).