	---
	language:
	- en
	tags:
	- stable-diffusion-xl
	- text-to-image
	license: unknown
	inference: true

	---



	This unofficial repository hosts a diffusers-compatible float16 checkpoint of the [WDXL](https://huggingface.co/hakurei/waifu-diffusion-xl) base UNet.

	For convenience (i.e. for use in a StableDiffusionXLPipeline) we include mirrors of other models (please adhere to their terms of usage):

	- [SDXL 0.9](https://huggingface.co/stabilityai/stable-diffusion-xl-base-0.9)
	  - tokenizers
	  - text encoders
	  - scheduler config
	- [madebyollin's fp16 VAE](https://huggingface.co/madebyollin/sdxl-vae-fp16-fix)

	## Usage (diffusers)

	### StableDiffusionXLPipeline

	Diffusers' `StableDiffusionXLPipeline` handles the text encoders, UNet and VAE for you:

	```python
	from diffusers import StableDiffusionXLPipeline, DPMSolverMultistepScheduler
	from diffusers.pipelines.stable_diffusion_xl import StableDiffusionXLPipelineOutput
	import torch
	from torch import Generator
	from PIL import Image
	from typing import List

	# scheduler args documented here:
	# https://github.com/huggingface/diffusers/blob/main/src/diffusers/schedulers/scheduling_dpmsolver_multistep.py#L98
	scheduler: DPMSolverMultistepScheduler = DPMSolverMultistepScheduler.from_pretrained(
	    'Birchlabs/waifu-diffusion-xl-unofficial',
	    subfolder='scheduler',
	    algorithm_type='sde-dpmsolver++',
	    solver_order=2,
	    # solver_type='heun' may give a sharper image. Cheng Lu reckons midpoint is better.
	    solver_type='midpoint',
	    use_karras_sigmas=True,
	)

	# pipeline args documented here:
	# https://github.com/huggingface/diffusers/blob/95b7de88fd0dffef2533f1cbaf9ffd9d3c6d04c8/src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py#L548
	pipe: StableDiffusionXLPipeline = StableDiffusionXLPipeline.from_pretrained(
	    'Birchlabs/waifu-diffusion-xl-unofficial',
	    scheduler=scheduler,
	    torch_dtype=torch.float16,
	    use_safetensors=True,
	    variant='fp16',
	)
	pipe.to('cuda')

	# StableDiffusionXLPipeline is hardcoded to cast the VAE to float32, but Ollin's VAE works fine in float16
	pipe.vae.to(torch.float16)

	prompt = 'masterpiece, best quality, 1girl, green hair, sweater, looking at viewer, upper body, beanie, outdoors, watercolor, night, turtleneck'
	negative_prompt = 'lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, artist name'

	out: StableDiffusionXLPipelineOutput = pipe(
	    prompt=prompt,
	    negative_prompt=negative_prompt,
	    num_inference_steps=25,
	    guidance_scale=12.,
	    original_size=(4096, 4096),
	    target_size=(1024, 1024),
	    height=1024,
	    width=1024,
	    generator=Generator().manual_seed(48),
	)

	images: List[Image.Image] = out.images
	img, *_ = images

	img.save('waifu.png')
	```

	You should get a picture like this:

	<img width="384px" height="384px" src="https://birchlabs.co.uk/share/wdxl-unofficial/0_48_waifu.png" title="seed 48: girl with green hair and sweater at night">
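
The `original_size` and `target_size` arguments in the call above are SDXL's size conditioning. As a rough sketch of how they reach the UNet: the pipeline flattens them, together with a crop offset, into a six-number "time ids" vector. `make_add_time_ids` below is a hypothetical helper written for illustration, not a diffusers API, and the `(0, 0)` crop offset is assumed to match the pipeline's default `crops_coords_top_left`.

```python
from typing import List, Tuple


def make_add_time_ids(
    original_size: Tuple[int, int],
    target_size: Tuple[int, int],
    crops_coords_top_left: Tuple[int, int] = (0, 0),
) -> List[int]:
    """Hypothetical helper: flatten SDXL size conditioning into six integers."""
    # Order: original (height, width), crop (top, left), target (height, width).
    return list(original_size + crops_coords_top_left + target_size)


# Same values as the pipeline call above.
time_ids = make_add_time_ids(original_size=(4096, 4096), target_size=(1024, 1024))
print(time_ids)  # [4096, 4096, 0, 0, 1024, 1024]
```

In other words, the example asks for a 1024x1024 image while telling the model to treat it as if it came from a 4096x4096 source.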

	### UNet2DConditionModel

	If you just want the UNet, you can load it like so:

	```python
	import torch
	from diffusers import UNet2DConditionModel

	base_unet: UNet2DConditionModel = UNet2DConditionModel.from_pretrained(
	    'Birchlabs/waifu-diffusion-xl-unofficial',
	    torch_dtype=torch.float16,
	    use_safetensors=True,
	    variant='fp16',
	    subfolder='unet',
	).eval().to(torch.device('cuda'))
	```
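
If you call the UNet directly, its `sample` input is a latent rather than an image. Here is a minimal sketch of the shape to expect, assuming SDXL's usual 4 latent channels and 8x VAE downsampling per side. `latent_shape` is a hypothetical helper, not part of diffusers.

```python
from typing import Tuple

# Assumptions: SDXL latents have 4 channels and the VAE downsamples by 8 per side.
LATENT_CHANNELS = 4
VAE_SCALE_FACTOR = 8


def latent_shape(height: int, width: int, batch_size: int = 1) -> Tuple[int, int, int, int]:
    """Hypothetical helper: shape of the latent tensor passed to the UNet as `sample`."""
    if height % VAE_SCALE_FACTOR or width % VAE_SCALE_FACTOR:
        raise ValueError('height and width must be multiples of 8')
    return (
        batch_size,
        LATENT_CHANNELS,
        height // VAE_SCALE_FACTOR,
        width // VAE_SCALE_FACTOR,
    )


print(latent_shape(1024, 1024))  # (1, 4, 128, 128)
```

Besides `sample`, the SDXL UNet's forward pass also takes `timestep`, `encoder_hidden_states` and `added_cond_kwargs` (containing `text_embeds` and `time_ids`), which the pipeline normally prepares for you.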

	## How it was converted

	I used Kohya's converter script to convert the official (`hakurei/waifu-diffusion-xl`) [`wdxl-aesthetic-0.9.safetensors`](https://huggingface.co/hakurei/waifu-diffusion-xl/blob/main/wdxl-aesthetic-0.9.safetensors) checkpoint. See [this commit](https://github.com/Birch-san/diffusers-play/commit/3f16355dd0064932d0bf356ed78676089b9e46ca).

	I forked [kohya's converter script](https://github.com/bmaltais/kohya_ss/blob/master/tools/convert_diffusers20_original_sd.py), making one [for SDXL](https://github.com/Birch-san/diffusers-play/blob/3f16355dd0064932d0bf356ed78676089b9e46ca/scripts/convert_diffusers20_original_sdxl.py).

	I invoked it like so:

	```bash
	python scripts/convert_diffusers20_original_sdxl.py \
	  --fp16 \
	  --use_safetensors \
	  --reference_model stabilityai/stable-diffusion-xl-base-0.9 \
	  in/wdxl-aesthetic-0.9.safetensors \
	  out/wdxl-diffusers
	```
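
After a conversion like this, a quick sanity check is to confirm that the output folder looks like a diffusers pipeline. The sketch below assumes the converter writes the standard layout (a top-level `model_index.json` and a `unet/config.json`). That layout is an assumption about the script's output, and `missing_entries` is a hypothetical helper.

```python
from pathlib import Path
from typing import List, Sequence

# Assumed entries of a diffusers pipeline folder; adjust if the converter output differs.
EXPECTED_ENTRIES = (
    'model_index.json',
    'unet/config.json',
)


def missing_entries(root: Path, expected: Sequence[str] = EXPECTED_ENTRIES) -> List[str]:
    """Hypothetical helper: return the expected relative paths that do not exist under root."""
    return [rel for rel in expected if not (root / rel).exists()]


# An empty list means every assumed entry was found.
print(missing_entries(Path('out/wdxl-diffusers')))
```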

	### NOTE: The work here is a Work in Progress! Nothing in this repository is final.

	# waifu-diffusion-xl - Diffusion for Rich Weebs

	waifu-diffusion-xl is a latent text-to-image diffusion model that has been conditioned on high-quality anime images by fine-tuning Stability AI's SDXL 0.9 model, which was provided as a research preview.

	![image](https://user-images.githubusercontent.com/26317155/254350263-59eca9df-503d-4ee7-b12e-b060d8eebd60.png)

	<sub>masterpiece, best quality, 1girl, green hair, sweater, looking at viewer, upper body, beanie, outdoors, watercolor, night, turtleneck</sub>

	## Model Description(s)

	- [wdxl-aesthetic-0.9](https://huggingface.co/hakurei/waifu-diffusion-xl/blob/main/wdxl-aesthetic-0.9.safetensors) is a checkpoint that has been fine-tuned against our in-house aesthetic dataset, which was created with the help of 15k aesthetic labels collected by volunteers. This model used Stability AI's [SDXL 0.9 checkpoint](https://huggingface.co/stabilityai/stable-diffusion-xl-base-0.9) as the base model for fine-tuning.

	## License

	This model has been released under the [SDXL 0.9 RESEARCH LICENSE AGREEMENT](https://huggingface.co/hakurei/waifu-diffusion-xl/blob/main/LICENSE.md) due to the repository containing the SDXL 0.9 weights before an official release. We have been given permission to release this model.

	## Downstream Uses

	This model can be used for entertainment purposes and as a generative art assistant.

	## Team Members and Acknowledgements

	This project would not have been possible without the incredible work by Stability AI and NovelAI.

	- [Haru](https://github.com/harubaru)
	- [Salt](https://github.com/sALTaccount/)
	- [closertodeath](https://huggingface.co/closertodeath)
	- [Kudo](https://negotiator.itch.io/)

	In order to reach us, you can join our [Discord server](https://discord.gg/touhouai).

	[![Discord Server](https://discordapp.com/api/guilds/930499730843250783/widget.png?style=banner2)](https://discord.gg/touhouai)