openpose controlnet for flux.dev

(big thanks to oxen.ai for sponsoring the GPU for the training)

inference

an openpose controlnet for flux-dev, trained on https://huggingface.co/datasets/raulc0399/open_pose_controlnet

the controlnet model is trained for the XLabs-AI x-flux pipeline: https://github.com/XLabs-AI/x-flux

to install the pipeline, execute the following:

git clone https://github.com/XLabs-AI/x-flux.git
cd x-flux
python3 -m venv xflux_env
source xflux_env/bin/activate
pip install -r requirements.txt

to run the pipeline with controlnet:

python3 main.py \
 --prompt "person enjoying a day at the park, full hd, cinematic" \
 --image ~/open_pose_controlnet_dataset/validation_images/pose/3_pose_1024.jpg \
 --control_type openpose \
 --local_path ./model.safetensors \
 --use_controlnet --model_type flux-dev \
 --width 1024 --height 1024 --timestep_to_start_cfg 2 \
 --num_steps 50 --true_gs 4 --guidance 4 \
 --save_path ~/gen_imgs
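
to run several prompts or pose images in one go, the same command can be assembled from python. this is only a minimal sketch, assuming it is started from inside the x-flux checkout; the helper name build_command and its defaults are made up for illustration and simply mirror the flags shown above:

```python
import os
import shlex
import subprocess  # only needed once the run() call below is uncommented


def build_command(prompt, pose_image, save_path="~/gen_imgs",
                  weights="./model.safetensors", size=1024):
    """Return the argv list for x-flux's main.py (hypothetical helper)."""
    return [
        "python3", "main.py",
        "--prompt", prompt,
        "--image", os.path.expanduser(pose_image),
        "--control_type", "openpose",
        "--local_path", weights,
        "--use_controlnet",
        "--model_type", "flux-dev",
        "--width", str(size), "--height", str(size),
        "--timestep_to_start_cfg", "2",
        "--num_steps", "50", "--true_gs", "4", "--guidance", "4",
        "--save_path", os.path.expanduser(save_path),
    ]


if __name__ == "__main__":
    pose = "~/open_pose_controlnet_dataset/validation_images/pose/3_pose_1024.jpg"
    prompts = [
        "person enjoying a day at the park, full hd, cinematic",
        "two friends sitting by each other enjoying a day at the park, full hd, cinematic",
    ]
    for prompt in prompts:
        cmd = build_command(prompt, pose)
        print(shlex.join(cmd))  # show the exact shell command
        # subprocess.run(cmd, check=True)  # uncomment to actually generate
```

passing the arguments as a list avoids shell quoting problems with prompts that contain commas or quotes.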

if the control image has already been preprocessed (it is already an openpose skeleton), comment out line 146 in src/flux/xflux_pipeline.py:

# self.annotator = Annotator(control_type, self.other_device)
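
line numbers shift between commits, so it may be safer to locate that line by its content before editing. a small sketch; find_annotator_lines is a hypothetical helper, and the search string is taken from the line quoted above:

```python
from pathlib import Path


def find_annotator_lines(source, needle="self.annotator = Annotator("):
    """Return (line_number, stripped_text) pairs for lines containing `needle`."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if needle in line
    ]


if __name__ == "__main__":
    # path as given above; nothing is printed if the file is not found
    path = Path("src/flux/xflux_pipeline.py")
    if path.exists():
        for number, line in find_annotator_lines(path.read_text()):
            print(f"{path}:{number}: {line}")
```

a line that is already commented out still matches, so the output also shows whether the change has been made.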

training

oxen clone https://hub.oxen.ai/raulc/open_pose_controlnet_dataset
git clone https://github.com/raulc0399/x-flux.git
cd x-flux
git checkout open_pose_training
python3 -m venv xflux_env
source xflux_env/bin/activate
pip install -r requirements.txt
huggingface-cli login
accelerate config
mkdir images
rsync -r ~/open_pose_controlnet_dataset/train/images/ images/
cp train_configs/test_openpose_controlnet.yaml train_configs/openpose_controlnet.yaml
accelerate launch train_flux_deepspeed_controlnet.py --config "train_configs/openpose_controlnet.yaml"

note 1: check the file train_configs/openpose_controlnet.yaml before starting

note 2: rsync is needed, cp does not work with that many files

note 3: the oxen repo has the caption files as json as expected by the training script
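
before a long training run it can help to confirm that every image really has a caption file. a minimal sketch, assuming the captions sit next to the images and share the file stem (for example 1.jpg with 1.json); that layout and the helper name missing_captions are assumptions, so adjust them to what the oxen repo actually contains:

```python
from pathlib import Path

# image extensions to check; an assumption, extend as needed
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def missing_captions(image_dir):
    """Return sorted names of images that have no same-stem .json file."""
    folder = Path(image_dir)
    return sorted(
        path.name
        for path in folder.iterdir()
        if path.suffix.lower() in IMAGE_SUFFIXES
        and not path.with_suffix(".json").exists()
    )


if __name__ == "__main__":
    if Path("images").is_dir():
        missing = missing_captions("images")
        print(f"{len(missing)} image(s) without a caption file")
        for name in missing[:20]:  # only list the first few
            print("  ", name)
```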

results

using these 2 images:

control image 1 control image 2

with these prompts:

"two friends sitting by each other enjoying a day at the park, full hd, cinematic"
"person enjoying a day at the park, full hd, cinematic"

resulted in these images:

result image 1 result image 2

License

Weights fall under the FLUX.1 [dev] Non-Commercial License
