license: other
license_name: stability-ai-community
license_link: https://huggingface.co/stabilityai/stable-diffusion-3.5-medium/blob/main/LICENSE
base_model: stabilityai/stable-diffusion-3.5-medium
tags:
- stable-diffusion
- stable-diffusion-3
- controlnet
- albedo
- text-to-image
- image-to-image
library_name: diffusers
pipeline_tag: image-to-image
Albedo-Conditioned ControlNet for Stable Diffusion 3.5
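This repository (adirik/albedo-controlnet-sd3) contains an albedo-conditioned ControlNet model trained on Stable Diffusion 3.5 Medium. Generation is conditioned on an albedo map together with a text prompt.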
Model Details
- Base Model: Stable Diffusion 3.5 Medium
- Checkpoint: {checkpoint_name}
- Conditioning: Albedo maps + text prompts
- Resolution: 512x512 (can be adapted to other resolutions)
- Training Dataset: PixelProse (albedo + RGB pairs with captions)
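If you adapt the model to another resolution, the requested size probably needs to fit the latent grid. The sketch below assumes the custom pipeline keeps the stock Stable Diffusion 3 constraint that height and width are divisible by 16 (VAE downsampling factor 8 times transformer patch size 2). That constraint, the helper name `snap_resolution`, and its `multiple` parameter are assumptions made for illustration; they are not taken from the training repo.

```python
def snap_resolution(height: int, width: int, multiple: int = 16) -> tuple[int, int]:
    """Round each side down to the nearest positive multiple of `multiple`.

    `multiple=16` is an assumption based on stock SD3 (VAE factor 8 x patch size 2).
    """
    if height < multiple or width < multiple:
        raise ValueError(f"both sides must be at least {multiple} pixels")
    return (height // multiple) * multiple, (width // multiple) * multiple


# The card's default resolution already satisfies the constraint.
print(snap_resolution(512, 512))
# An arbitrary size is rounded down on each side.
print(snap_resolution(770, 1000))
```

The snapped values can then be used both for `albedo_image.resize(...)` and for the `height`/`width` arguments, so the control image and the output stay the same size.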
Usage
import torch
from diffusers import AutoencoderKL, SD3Transformer2DModel
from transformers import CLIPTokenizer, T5TokenizerFast
from PIL import Image
import numpy as np
# Load base model components
base_model = "stabilityai/stable-diffusion-3.5-medium"
vae = AutoencoderKL.from_pretrained(base_model, subfolder="vae")
# Load trained transformer
transformer = SD3Transformer2DModel.from_pretrained(
    "{model_id}",
    subfolder="transformer",
    torch_dtype=torch.bfloat16,
)
# Load the custom pipeline from the training repo. This is not the stock
# diffusers class: it accepts the extra control_image argument used below.
from pipelines.pipeline_stable_diffusion_3 import StableDiffusion3Pipeline
pipeline = StableDiffusion3Pipeline.from_pretrained(
    base_model,
    transformer=transformer,
    vae=vae,
    torch_dtype=torch.bfloat16,
)
pipeline.to("cuda")
# Load and prepare albedo image
albedo_image = Image.open("path/to/albedo.png").convert("RGB")
albedo_image = albedo_image.resize((512, 512))
# Convert to a channels-first tensor and rescale from [0, 255] to [-1, 1]
albedo_np = np.array(albedo_image).astype(np.float32) / 255.0
albedo_tensor = torch.from_numpy(albedo_np).permute(2, 0, 1) * 2.0 - 1.0
# Add two leading dimensions: (3, H, W) -> (1, 1, 3, H, W)
albedo_tensor = albedo_tensor.unsqueeze(0).unsqueeze(0).to("cuda", dtype=torch.bfloat16)
# Encode albedo to control latents (light_utils also comes from the training repo)
from light_utils import encode_intrinsics
control_latents = encode_intrinsics(albedo_tensor, vae, torch.bfloat16)
# Generate
prompt = "A beautiful landscape, soft golden hour lighting"
image = pipeline(
    prompt=prompt,
    control_image=control_latents,
    num_inference_steps=50,
    guidance_scale=7.5,
    height=512,
    width=512,
).images[0]
image.save("output.png")
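The preprocessing step in the example can be checked without a GPU. The following is a minimal NumPy-only sketch of the same arithmetic (scale to [0, 1], move channels first, rescale to [-1, 1], add two leading axes). The function name `albedo_to_model_input` is hypothetical, and reading the two leading axes as a batch axis plus a per-intrinsic axis is an assumption drawn from the double `unsqueeze(0)` above, not from the training code.

```python
import numpy as np


def albedo_to_model_input(albedo_hwc_uint8: np.ndarray) -> np.ndarray:
    """Map an (H, W, 3) uint8 image to a (1, 1, 3, H, W) float32 array in [-1, 1]."""
    if albedo_hwc_uint8.ndim != 3 or albedo_hwc_uint8.shape[2] != 3:
        raise ValueError("expected an (H, W, 3) RGB array")
    scaled = albedo_hwc_uint8.astype(np.float32) / 255.0  # values in [0, 1]
    chw = np.transpose(scaled, (2, 0, 1)) * 2.0 - 1.0     # channels first, values in [-1, 1]
    return chw[None, None]                                # add two leading axes


# Synthetic pure-red albedo: red channel maps to +1, green and blue to -1.
demo = np.zeros((4, 6, 3), dtype=np.uint8)
demo[..., 0] = 255
out = albedo_to_model_input(demo)
print(out.shape, out.dtype, float(out.min()), float(out.max()))
```

Inspecting the shape and value range this way before moving the tensor to the GPU makes it easier to catch a control image that was left in [0, 255] or in channels-last layout.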
Lighting Control
Because an albedo map encodes surface color without shading, lighting is left to the text prompt. The model responds well to lighting descriptions:
# Different lighting conditions
prompts = [
    "A forest scene, at sunrise",
    "A forest scene, with fluorescent blue lighting",
]
for prompt in prompts:
    image = pipeline(
        prompt=prompt,
        control_image=control_latents,
        num_inference_steps=50,
    ).images[0]
    # Each will have different lighting/mood
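When comparing many lighting conditions on the same albedo map, it can help to build the prompts and output filenames systematically so that no result is overwritten. The sketch below is illustrative only: `lighting_prompts` and the `output_XX.png` naming scheme are hypothetical and not part of the training repo.

```python
def lighting_prompts(scene: str, lightings: list[str]) -> list[str]:
    """Combine one scene description with several lighting phrases."""
    return [f"{scene}, {lighting.strip()}" for lighting in lightings if lighting.strip()]


variants = lighting_prompts(
    "A forest scene",
    ["at sunrise", "with fluorescent blue lighting"],
)

# Pair every prompt with its own filename so each result can be saved separately.
jobs = [(f"output_{index:02d}.png", prompt) for index, prompt in enumerate(variants)]
for filename, prompt in jobs:
    print(filename, "<-", prompt)
```

Inside the loop, each `prompt` would be passed to `pipeline(...)` with the same `control_latents`, and the resulting image saved under `filename`.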
License
This model inherits the license from Stable Diffusion 3.5 Medium. See: https://huggingface.co/stabilityai/stable-diffusion-3.5-medium/blob/main/LICENSE