Commit 507fca9 by friedrichor (1 parent: d38fd44)
Create README.md

README.md ADDED
@@ -0,0 +1,41 @@

Stable Diffusion v2.1 fine-tuned on the text-image dataset `friedrichor/PhotoChat_120_square_HQ`.

# Model Details

- Model type: Diffusion-based text-to-image generation model
- Language(s): English
- Fine-tuning dataset: [friedrichor/PhotoChat_120_square_HQ](https://huggingface.co/datasets/friedrichor/PhotoChat_120_square_HQ)

## Dataset

[friedrichor/PhotoChat_120_square_HQ](https://huggingface.co/datasets/friedrichor/PhotoChat_120_square_HQ) was used to fine-tune Stable Diffusion v2.1.

The dataset contains 120 image-text pairs.

Images were manually selected from the [PhotoChat](https://aclanthology.org/2021.acl-long.479/) dataset, cropped to square, and processed with `Gigapixel` to improve their quality.
Image captions were generated by [BLIP-2](https://arxiv.org/abs/2301.12597).
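
The card does not say how the square crops were chosen. Purely as an illustration, and assuming a simple center crop (which may not be what was done for this dataset), the crop box could be computed as in the sketch below. `center_square_box` is a hypothetical name introduced here; the returned tuple uses the `(left, upper, right, lower)` order that Pillow's `Image.crop` accepts.

```python
from typing import Tuple


def center_square_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return (left, upper, right, lower) for the largest centered square.

    Hypothetical helper: assumes a center crop, which the dataset card
    does not confirm.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    side = min(width, height)       # the square's edge is the shorter image side
    left = (width - side) // 2      # trim equally from left and right
    upper = (height - side) // 2    # trim equally from top and bottom
    return (left, upper, left + side, upper + side)


print(center_square_box(1024, 768))  # (128, 0, 896, 768)
```

With Pillow installed, the box can be passed directly as `img.crop(center_square_box(*img.size))`.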

# Simple use example

```python
import torch
from diffusers import StableDiffusionPipeline

# Load the fine-tuned pipeline and move it to the GPU.
device = "cuda:0"
pipe = StableDiffusionPipeline.from_pretrained(
    "friedrichor/stable-diffusion-v2.1-portraiture", torch_dtype=torch.float32
)
pipe.to(device)

# Subject description, plus style keywords appended to the prompt.
prompt = "a woman in a red and gold costume with feathers on her head"
extra_prompt = ", facing the camera, photograph, highly detailed face, depth of field, moody light, style by Yasmin Albatoul, Harry Fayt, centered, extremely detailed, Nikon D850, award winning photography"
negative_prompt = "cartoon, anime, ugly, (aged, white beard, black skin, wrinkle:1.1), (bad proportions, unnatural feature, incongruous feature:1.4), (blurry, un-sharp, fuzzy, un-detailed skin:1.2), (facial contortion, poorly drawn face, deformed iris, deformed pupils:1.3), (mutated hands and fingers:1.5), disconnected hands, disconnected limbs"

# Seed the generator so repeated runs start from the same noise.
generator = torch.Generator(device=device).manual_seed(42)
image = pipe(
    prompt + extra_prompt,
    negative_prompt=negative_prompt,
    height=768,
    width=768,
    num_inference_steps=20,
    guidance_scale=7.5,
    generator=generator,
).images[0]
image.save("image.png")
```
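
The negative prompt above uses `(text:1.3)` groups, a weighting notation common in some Stable Diffusion web UIs. As far as I know, the plain `StableDiffusionPipeline` does not parse this notation and treats the parentheses and numbers as ordinary prompt text, so the weights may not act as weights here. Under that assumption, the sketch below strips the notation and keeps only the words. `strip_prompt_weights` is a hypothetical helper, not part of `diffusers`, and it only handles non-nested groups.

```python
import re

# One "(some text:1.3)" group: text with no nested parentheses or colons,
# followed by a numeric weight.
_WEIGHTED_GROUP = re.compile(r"\(([^():]+):\s*\d+(?:\.\d+)?\)")


def strip_prompt_weights(prompt: str) -> str:
    """Replace every "(text:weight)" group with just "text" (hypothetical helper)."""
    return _WEIGHTED_GROUP.sub(lambda m: m.group(1).strip(), prompt)


example = "cartoon, (blurry, fuzzy:1.2), (mutated hands and fingers:1.5), disconnected limbs"
print(strip_prompt_weights(example))
# cartoon, blurry, fuzzy, mutated hands and fingers, disconnected limbs
```

Whether to strip the weights or to apply them through a separate prompt-weighting tool is a design choice this README does not settle.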