sayakpaul (HF staff) committed
Commit 971090c
1 parent: 7f81005

Create README.md

  1. README.md +77 -0
README.md ADDED
@@ -0,0 +1,77 @@
+ ---
+ license: openrail++
+ base_model: stabilityai/stable-diffusion-xl-base-1.0
+ tags:
+ - stable-diffusion-xl
+ - stable-diffusion-xl-diffusers
+ - text-to-image
+ - diffusers
+ - instruct-pix2pix
+ inference: false
+ datasets:
+ - timbrooks/instructpix2pix-clip-filtered
+ ---
+
+ # SDXL InstructPix2Pix (768x768)
+
+ Instruction fine-tuning of [Stable Diffusion XL (SDXL)](https://hf.co/papers/2307.01952) à la [InstructPix2Pix](https://huggingface.co/papers/2211.09800). Some results below:
+
+
+ ## Usage in 🧨 diffusers
+
+ Make sure to install the libraries first:
+
+ ```bash
+ pip install accelerate transformers
+ pip install git+https://github.com/huggingface/diffusers
+ ```
+
+ ```python
+ import torch
+ from diffusers import StableDiffusionXLInstructPix2PixPipeline
+ from diffusers.utils import load_image
+
+ # The checkpoint was trained at 768x768, so resize the input to match.
+ resolution = 768
+ image = load_image(
+     "https://hf.co/datasets/diffusers/diffusers-images-docs/resolve/main/mountain.png"
+ ).resize((resolution, resolution))
+ edit_instruction = "Turn sky into a cloudy one"
+
+ # Load the pipeline in half precision and move it to the GPU.
+ pipe = StableDiffusionXLInstructPix2PixPipeline.from_pretrained(
+     "diffusers/sdxl-instructpix2pix-768", torch_dtype=torch.float16
+ ).to("cuda")
+
+ # guidance_scale controls adherence to the text instruction;
+ # image_guidance_scale controls how closely the output follows the input image.
+ edited_image = pipe(
+     prompt=edit_instruction,
+     image=image,
+     height=resolution,
+     width=resolution,
+     guidance_scale=3.0,
+     image_guidance_scale=1.5,
+     num_inference_steps=30,
+ ).images[0]
+ edited_image.save("edited_image.png")
+ ```
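The snippet above resizes the input directly to a square, which distorts the image if it is not square to begin with. A minimal sketch of one way around that, assuming a center crop is acceptable for your image: compute the largest centered square first, then crop and resize. `center_square_box` is a hypothetical helper written for this illustration; it is not part of diffusers.

```python
from typing import Tuple


def center_square_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return (left, upper, right, lower) of the largest centered square.

    The tuple uses the same box convention as PIL's `Image.crop`.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


# Usage with a PIL image, such as the one returned by `load_image`:
#   box = center_square_box(*image.size)
#   image = image.crop(box).resize((resolution, resolution))
print(center_square_box(1024, 768))  # (128, 0, 896, 768)
```

Cropping discards content at the edges, so whether it is preferable to a plain resize depends on where the region you want to edit sits in the frame.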
+
+ To learn more, refer to the [documentation](https://huggingface.co/docs/diffusers/main/en/api/pipelines/pix2pix).
+
+ 🚨 Note that this checkpoint is experimental and has considerable room for improvement. Please use the "Discussions" tab of this repository to open issues and discuss. 🚨
+
+ ## Training
+ We fine-tuned SDXL using the InstructPix2Pix training methodology for 15,000 steps with a fixed learning rate of 5e-6 at an image resolution of 768x768.
+
+ Our training scripts and other utilities are available [here](https://github.com/sayakpaul/instructpix2pix-sdxl/tree/b9acc91d6ddf1f2aa2f9012b68216deb40e178f3); they build on our [official training script](https://huggingface.co/docs/diffusers/main/en/training/instructpix2pix).
+
+ Our training logs are available on Weights and Biases [here](https://wandb.ai/sayakpaul/instruct-pix2pix-sdxl-new/runs/sw53gxmc). Refer to them for details on all the hyperparameters.
+
+ ### Training data
+ We used this dataset: [timbrooks/instructpix2pix-clip-filtered](https://huggingface.co/datasets/timbrooks/instructpix2pix-clip-filtered).
+
+ ### Compute
+ One 8xA100 machine
+
+ ### Batch size
+ Data parallel with a per-GPU batch size of 8, for a total batch size of 32.
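As a rough illustration of how these numbers relate: in data-parallel training, the total batch size is the per-GPU batch size times the number of data-parallel processes times the number of gradient accumulation steps. The sketch below assumes a single accumulation step, which is an assumption and not something stated here; the Weights and Biases run linked in the Training section has the actual settings. The helper names are hypothetical.

```python
def total_batch_size(per_gpu: int, num_processes: int, grad_accum_steps: int = 1) -> int:
    """Total batch size per optimizer step under data parallelism."""
    return per_gpu * num_processes * grad_accum_steps


def implied_processes(total: int, per_gpu: int, grad_accum_steps: int = 1) -> int:
    """Number of data-parallel processes implied by a total and per-GPU batch size."""
    per_process = per_gpu * grad_accum_steps
    if total % per_process != 0:
        raise ValueError("total is not a multiple of per_gpu * grad_accum_steps")
    return total // per_process


# ASSUMPTION: one gradient accumulation step.
print(implied_processes(total=32, per_gpu=8))      # 4
print(total_batch_size(per_gpu=8, num_processes=4))  # 32

# Training samples processed, assuming each of the 15,000 steps is one optimizer step.
print(15_000 * 32)  # 480000
```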
+
+ ### Mixed precision
+ FP16