🎨 StyQA: A Quasi-Agent Framework for Versatile Style Transfer

🔬 Introduction

StyQA is a unified quasi-agent framework for versatile style transfer tasks (pixel-level, semantic-level, continuous style transfer, etc.). The agent is designed to accept both multi-step style transfer pipeline prompts and ordinary single-task style transfer instructions.

Typically, StyQA runs a style-analysis $\rightarrow$ style-transfer $\rightarrow$ style-criteria pipeline and iteratively refines the outputs, up to a maximum number of refinement rounds. For convenience, stages can also be called individually: StyQA can run a single stage or any combination of stages.
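The control flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not StyQA's actual implementation: the function `run_pipeline_sketch`, the stage callables `analyze`, `transfer` and `criteria`, the `max_refinements` parameter, and the `(passed, revised_plan)` return shape are all hypothetical names chosen for the example.

```python
from typing import Any, Callable, Tuple


def run_pipeline_sketch(
    user_input: Any,
    analyze: Callable[[Any], Any],
    transfer: Callable[[Any, Any], Any],
    criteria: Callable[[Any, Any], Tuple[bool, Any]],
    max_refinements: int = 3,
) -> Any:
    """Hypothetical analysis -> transfer -> criteria loop with bounded refinement."""
    # Stage 1: style analysis turns the request into a plan.
    plan = analyze(user_input)
    # Stage 2: style transfer produces a first output from that plan.
    output = transfer(user_input, plan)
    for _ in range(max_refinements):
        # Stage 3: style criteria judges the output and may revise the plan.
        passed, plan = criteria(output, plan)
        if passed:
            break
        # Refinement round: redo the transfer with the revised plan.
        output = transfer(user_input, plan)
    return output
```

Passing the stages as separate callables mirrors the single-stage calling mentioned above: each stage can be exercised on its own, and the loop only fixes the order and the refinement budget.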

🚀 Quick Start

Environment

StyQA uses uv for environment management, and a pyproject.toml is provided in this repo. Note that the embedded base model requires the latest version of diffusers, which should be installed from source with:

uv pip install git+https://github.com/huggingface/diffusers

Computational resources: make sure your device meets the following memory requirements.

In bfloat16, the model consumes about 52 GB of memory. If LoRA is used, each LoRA module consumes about 1 GB more. For inference on images of at least $1024\times 1024$, a device with $\geq 80$ GB is therefore required.
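The arithmetic behind these figures can be written out as a small helper. This is a rough sketch that only uses the approximate numbers quoted above (about 52 GB for the model, about 1 GB per LoRA module). The function names are hypothetical, and the memory needed for activations at a given resolution is not quantified here, so the result is the headroom left for it rather than a guarantee.

```python
def estimate_memory_gb(
    num_lora_modules: int = 0,
    base_model_gb: float = 52.0,   # approximate bfloat16 footprint quoted above
    per_lora_gb: float = 1.0,      # approximate cost per LoRA module quoted above
) -> float:
    """Rough weight-memory estimate in GB for the model plus LoRA modules."""
    if num_lora_modules < 0:
        raise ValueError("num_lora_modules must be non-negative")
    return base_model_gb + num_lora_modules * per_lora_gb


def headroom_gb(device_gb: float, num_lora_modules: int = 0) -> float:
    """Memory left on the device for activations and intermediate buffers."""
    return device_gb - estimate_memory_gb(num_lora_modules)


# Example: an 80 GB device with two LoRA modules loaded.
print(estimate_memory_gb(2))   # 54.0
print(headroom_gb(80, 2))      # 26.0
```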

Demos

main.py provides a demo for continuous style transfer:

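import hydra
from hydra.utils import instantiate
from omegaconf import DictConfig

# StyQA and UserInput are defined in this repo; import them from its package.

@hydra.main(version_base="v1.2", config_path="configs", config_name="agent")
def main(cfgs: DictConfig):
    # Explicit prompt: names each task type and the pixel-level strength.
    prompt = "Convert the pixel colors of Picture 1 into the pixel colors of Picture 2 with strength 0.75 and then transfer the semantic style into Picture 3."
    cnt_image_path = "demos/content1.jpg"  # Picture 1 (content image)
    ref_image_paths = ["demos/pixel.jpg", "demos/semantic1.jpg"]  # Pictures 2 and 3

    # Build the agent from configs/agent.yaml and run the full pipeline.
    agent: StyQA = instantiate(cfgs.agent)
    user_input = UserInput(
        prompt=prompt,
        cnt_image_path=cnt_image_path,
        ref_image_paths=ref_image_paths,
    )
    agent.run_pipeline(user_input)

if __name__ == "__main__":
    main()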

Simply run

python main.py

and the agent will produce the stylized outputs.

The agent configurations (seed, num_inference_steps, etc.) can be modified in configs/agent.yaml, or overridden from the command line via Hydra:

python main.py agent.seed=1234 agent.num_inference_steps=16
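To make the effect of such overrides concrete, the sketch below applies dotted `key=value` strings to a nested dictionary. It is a simplified stand-in written for illustration, not Hydra's implementation (Hydra also handles config groups, type checking, and more). The helper `apply_overrides` and the default values in `config` are hypothetical; the real defaults live in configs/agent.yaml.

```python
import ast
from typing import Any, Dict, List


def apply_overrides(config: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Apply 'a.b=value' style overrides to a nested dict, in place."""
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        try:
            # Interpret numbers, booleans, lists, etc. as Python literals.
            value = ast.literal_eval(raw_value)
        except (ValueError, SyntaxError):
            value = raw_value  # fall back to a plain string
        node = config
        *parents, leaf = dotted_key.split(".")
        for name in parents:
            # Walk down the nested dict, creating levels that do not exist yet.
            node = node.setdefault(name, {})
        node[leaf] = value
    return config


# Hypothetical defaults, for illustration only.
config = {"agent": {"seed": 0, "num_inference_steps": 28}}
apply_overrides(config, ["agent.seed=1234", "agent.num_inference_steps=16"])
print(config)  # {'agent': {'seed': 1234, 'num_inference_steps': 16}}
```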

Note that the prompt can state the strength value and the style transfer task types explicitly (this is what we recommend), or describe them implicitly, for example:

"Using the colors and textures from Picture 2 ... moderate strength ... as if they are in same style category ..."

🖼️ Visualization

We provide more visualization results below.

Pixel-level style transfer

Similar to arbitrary image style transfer, pixel-level style transfer aims to use the color and texture features of low-level pixels.

Semantic-level style transfer

Artistic styles can be divided into different categories. Semantic-level style transfer aims to regenerate the content image in the same style category as the reference, based on what we call style-centric semantic features.

Continuous style transfer

Given a pipeline command, StyQA parses it into a workflow and conducts the style transfer steps one by one. Here we provide some demos.
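One way to picture such a workflow is as an ordered list of steps, each feeding its output to the next. The sketch below is an assumption about the shape of the data, not StyQA's internal representation: `StyleStep`, `run_workflow` and `apply_step` are hypothetical names, and the two steps are the demo prompt from main.py written out by hand rather than parsed automatically.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StyleStep:
    """One step of a hypothetical parsed workflow."""
    task: str                         # e.g. "pixel" or "semantic"
    ref_image_path: str               # reference image for this step
    strength: Optional[float] = None  # None when the prompt gives no value


def run_workflow(
    cnt_image: str,
    steps: List[StyleStep],
    apply_step: Callable[[str, StyleStep], str],
) -> str:
    """Apply the steps in order; each step consumes the previous result."""
    current = cnt_image
    for step in steps:
        current = apply_step(current, step)
    return current


# Demo prompt: pixel-level transfer at strength 0.75, then semantic-level transfer.
steps = [
    StyleStep("pixel", "demos/pixel.jpg", strength=0.75),
    StyleStep("semantic", "demos/semantic1.jpg"),
]

# Toy stand-in for the real transfer model: it only records the order of steps.
result = run_workflow(
    "demos/content1.jpg", steps, lambda img, step: f"{step.task}({img})"
)
print(result)  # semantic(pixel(demos/content1.jpg))
```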

📝 TODO

Open-source inference code
Deliver demos
Upload prepared LoRA weights
Open-source training/fine-tuning code
More experiments
