	---
	license: mit
	base_model: stabilityai/stable-diffusion-xl-base-1.0
	tags:
	- stable-diffusion
	- stable-diffusion-diffusers
	- text-to-image
	- diffusers
	- lora
	inference: true
	---

	# sdxl-wrong-lora

	A LoRA for SDXL 1.0 Base that improves output image quality when it is loaded and `wrong` is used as the negative prompt during inference.

	Benefits of using this LoRA:

	- Higher detail in textures and fabrics, particularly at the full 1024x1024 resolution
	- Higher color saturation and vibrance
	- Higher sharpness for blurry and background objects
	- Better at generating anatomically correct hands
	- Less likely to produce random artifacts
	- Appears to make the model follow the input prompt more predictably

	## Usage

	The LoRA can be loaded using `load_lora_weights` like any other LoRA in `diffusers`:

	```py
	import torch
	from diffusers import DiffusionPipeline, AutoencoderKL

	# VAE variant patched to run in fp16
	vae = AutoencoderKL.from_pretrained(
	    "madebyollin/sdxl-vae-fp16-fix",
	    torch_dtype=torch.float16,
	)
	# SDXL 1.0 base pipeline in half precision, using the VAE above
	base = DiffusionPipeline.from_pretrained(
	    "stabilityai/stable-diffusion-xl-base-1.0",
	    vae=vae,
	    torch_dtype=torch.float16,
	    variant="fp16",
	    use_safetensors=True,
	)

	# Load the LoRA weights on top of the base pipeline
	base.load_lora_weights("minimaxir/sdxl-wrong-lora")

	_ = base.to("cuda")
	```

	During inference, use `wrong` as the negative prompt.
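As a minimal sketch of that step, assuming the `base` pipeline from the loading snippet above is already on the GPU: the helper names `build_call_kwargs` and `generate` are illustrative and not part of `diffusers`, while `negative_prompt`, `guidance_scale`, and `generator` are standard pipeline call arguments. The default guidance scale and seed simply mirror the values quoted in the Examples section.

```py
def build_call_kwargs(prompt, guidance_scale=13.0):
    """Collect pipeline call arguments with `wrong` as the negative prompt."""
    return {
        "prompt": prompt,
        "negative_prompt": "wrong",
        "guidance_scale": guidance_scale,
    }


def generate(base, prompt, seed=56583700):
    """Run the LoRA-loaded pipeline with a fixed seed (needs a CUDA GPU)."""
    import torch  # imported lazily so the helper above has no dependencies

    generator = torch.Generator(device="cuda").manual_seed(seed)
    return base(**build_call_kwargs(prompt), generator=generator).images[0]
```

Calling `generate(base, "a photo of a red fox in the snow")` would then return a single PIL image.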

	## Examples

	Left: base model output (no LoRA) + refiner. Right: base model + LoRA + refiner. Both generations use the same seed.

	`realistic human Shrek blogging at a computer workstation, hyperrealistic award-winning photo for vanity fair` (cfg = 13, seed = 56583700)

	![](img/example1.webp)
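The exact refiner workflow behind these comparisons is not described here, so the following is only a sketch of one common setup: the base pipeline denoises part of the way and hands its latents to the refiner. The function name `run_base_and_refiner` and the `split=0.8` handoff point are assumptions for illustration; `denoising_end`, `denoising_start`, and `output_type="latent"` are existing SDXL pipeline arguments in `diffusers`. The refiner is assumed to be a separately loaded `stabilityai/stable-diffusion-xl-refiner-1.0` pipeline.

```py
def run_base_and_refiner(base, refiner, prompt, negative_prompt=None,
                         generator=None, cfg=13.0, split=0.8):
    """Generate with the base pipeline, then finish denoising with the refiner.

    `split` is the fraction of denoising done by the base model
    (an assumed value, not a setting taken from this model card).
    """
    # Base pass: stop early and return latents instead of a decoded image
    latents = base(
        prompt=prompt,
        negative_prompt=negative_prompt,
        guidance_scale=cfg,
        denoising_end=split,
        output_type="latent",
        generator=generator,
    ).images
    # Refiner pass: resume from the same point in the noise schedule
    return refiner(
        prompt=prompt,
        negative_prompt=negative_prompt,
        guidance_scale=cfg,
        denoising_start=split,
        image=latents,
        generator=generator,
    ).images[0]
```

For a same-seed comparison, create a fresh `torch.Generator(device="cuda").manual_seed(56583700)` for each run, once with the LoRA loaded (and `negative_prompt="wrong"`) and once without it.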

	## Methodology

	The methodology and motivation for this LoRA are similar to those of my [wrong SD 2.0 textual inversion embedding](https://huggingface.co/minimaxir/wrong_embedding_sd_2_0): it is trained on a balanced variety of undesirable outputs, but as a LoRA, since textual inversion with SDXL is complicated. The base images were generated by SDXL itself, with some prompt weighting to emphasize undesirable attributes for test images.

	## Notes

	- The intuitive way to think about how this LoRA works is that on training start, it indicates an undesirable area of the vast high-dimensional latent space, which the rest of the diffusion process will move away from. This may work more effectively than textual inversion, but more testing needs to be done.
	- It's possible to use `not wrong` in the normal prompt itself, but in testing it has had little effect.
	- You can use other negative prompts in conjunction with the `wrong` prompt, but you may want to weight them appropriately.
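To illustrate the last note, here is a small sketch that builds a weighted negative prompt string. It assumes the `(term)weight` syntax used by the Compel prompt-weighting library. The `diffusers` pipeline does not parse such weights on its own, so the string would need to be turned into embeddings by a weighting library before being passed to the pipeline. The helper name `weighted_negative_prompt` and the example terms and weights are hypothetical.

```py
def weighted_negative_prompt(extra_terms, wrong_weight=1.0):
    """Join `wrong` with extra negative terms using `(term)weight` syntax.

    `extra_terms` maps each additional negative term to its weight.
    """
    parts = [f"(wrong){wrong_weight}"]
    parts += [f"({term}){weight}" for term, weight in extra_terms.items()]
    return ", ".join(parts)


# Example: keep `wrong` at full weight and down-weight the other terms
negative = weighted_negative_prompt({"blurry": 0.8, "watermark": 0.5})
```

Keeping `wrong` at or above the weight of the other terms is one way to stop them from drowning out the LoRA's effect, though suitable weights will depend on the prompt.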