---
license: openrail++
language:
- en
pipeline_tag: text-to-image
tags:
- stable-diffusion
- stable-diffusion-diffusers
- stable-diffusion-xl
inference: true
widget:
- text: >-
    face focus, cute, masterpiece, best quality, 1girl, green hair, sweater, looking at viewer, upper body, beanie, outdoors, night, turtleneck
  example_title: example 1girl
- text: >-
    face focus, bishounen, masterpiece, best quality, 1boy, green hair, sweater, looking at viewer, upper body, beanie, outdoors, night, turtleneck
  example_title: example 1boy
library_name: diffusers
datasets:
- Linaqruf/animagine-datasets
---

<style>
  .title-container {
    display: flex;
    justify-content: center;
    align-items: center;
    height: 100vh; /* Adjust this value to position the title vertically */
  }
  .title {
    font-size: 3em;
    text-align: center;
    color: #333;
    font-family: 'Helvetica Neue', sans-serif;
    text-transform: uppercase;
    letter-spacing: 0.1em;
    padding: 0.5em 0;
    background: transparent;
  }
  .title span {
    background: -webkit-linear-gradient(45deg, #7ed56f, #28b485);
    -webkit-background-clip: text;
    -webkit-text-fill-color: transparent;
  }
  .custom-table {
    table-layout: fixed;
    width: 100%;
    border-collapse: collapse;
    margin-top: 2em;
  }
  .custom-table td {
    width: 50%;
    vertical-align: top;
    padding: 10px;
    box-shadow: 0px 0px 10px 0px rgba(0,0,0,0.15);
  }
  .custom-image {
    width: 100%;
    height: auto;
    object-fit: cover;
    border-radius: 10px;
    transition: transform .2s;
    margin-bottom: 1em;
  }
  .custom-image:hover {
    transform: scale(1.05);
  }
</style>

<h1 class="title"><span>Animagine XL</span></h1>

<table class="custom-table">
  <tr>
    <td>
      <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image1.png">
        <img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image1.png" alt="sample1">
      </a>
      <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image3.png">
        <img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image3.png" alt="sample3">
      </a>
    </td>
    <td>
      <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image2.png">
        <img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image2.png" alt="sample2">
      </a>
      <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image4.png">
        <img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image4.png" alt="sample4">
      </a>
    </td>
  </tr>
</table>

<hr>

## Overview

**Animagine XL** is a high-resolution, latent text-to-image diffusion model. It was fine-tuned from Stable Diffusion XL 1.0 with a learning rate of `4e-7` over 27,000 global steps at a batch size of 16, on a curated dataset of high-quality anime-style images.

- Use it with the [`Stable Diffusion Webui`](https://github.com/AUTOMATIC1111/stable-diffusion-webui)
- Use it with 🧨 [`diffusers`](https://huggingface.co/docs/diffusers/index)
- Use it with [`ComfyUI`](https://github.com/comfyanonymous/ComfyUI) **(recommended)**

Like other anime-style Stable Diffusion models, it supports Danbooru tags for generating images.

e.g. _**face focus, cute, masterpiece, best quality, 1girl, green hair, sweater, looking at viewer, upper body, beanie, outdoors, night, turtleneck**_

## Features

1. High-Resolution Images: The model was trained at a base resolution of 1024x1024 using the [NovelAI Aspect Ratio Bucketing Tool](https://github.com/NovelAI/novelai-aspect-ratio-bucketing), which allows training at non-square resolutions as well.
2. Anime-Styled Generation: Given a text prompt, the model creates high-quality anime-styled images.
3. Fine-Tuned Diffusion Process: The model utilizes a fine-tuned diffusion process to ensure high-quality, distinctive image output.
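
The aspect-ratio bucketing mentioned above can be sketched roughly as follows: enumerate (width, height) pairs whose pixel count stays within the 1024x1024 training budget. This is a simplified illustration with hypothetical names (`make_buckets` and its parameters are not from the NovelAI tool):

```python
# Simplified sketch of aspect-ratio bucketing (hypothetical helper, not the
# NovelAI implementation): enumerate (width, height) pairs, in steps of 64,
# whose pixel count stays within a 1024x1024 budget.
def make_buckets(max_pixels=1024 * 1024, step=64, min_dim=512, max_dim=2048):
    buckets = []
    width = min_dim
    while width <= max_dim:
        # Largest height (a multiple of `step`) that keeps us within budget
        height = (max_pixels // width) // step * step
        if min_dim <= height <= max_dim:
            buckets.append((width, height))
        width += step
    return buckets

buckets = make_buckets()  # includes square and non-square pairs, e.g. (1024, 1024)
```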

<hr>

## Model Details

- **Developed by:** [Linaqruf](https://github.com/Linaqruf)
- **Model type:** Diffusion-based text-to-image generative model
- **Model Description:** This model can be used to generate and modify high-quality anime-themed images based on text prompts.
- **License:** [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/LICENSE-MODEL)
- **Finetuned from model:** [Stable Diffusion XL 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)

<hr>

## How to Use

- Download `Animagine XL` [here](https://huggingface.co/Linaqruf/animagine-xl/resolve/main/animagine-xl.safetensors); the model is in `.safetensors` format.
- Use Danbooru-style tags as the prompt rather than natural language; otherwise, you will get realistic results instead of anime.
- You can use any generic negative prompt, or use the following suggested negative prompt to guide the model toward high-aesthetic generations:
```
lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
```
- The following should also be prepended to prompts to get high-aesthetic results:
```
masterpiece, best quality, illustration, beautiful detailed, finely detailed, dramatic light, intricate details
```
- Use this cheat sheet to find the best resolution:
```
768 x 1344: Vertical (9:16)
915 x 1144: Portrait (4:5)
1024 x 1024: Square (1:1)
1182 x 886: Photo (4:3)
1254 x 836: Landscape (3:2)
1365 x 768: Widescreen (16:9)
1564 x 670: Cinematic (21:9)
```
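
Putting the suggestions above together, a small helper can prepend the quality tags and look up a cheat-sheet resolution. This is a hypothetical convenience sketch (the names `QUALITY_TAGS`, `RESOLUTIONS`, and `build_prompt` are illustrative, not part of the model or any library API):

```python
# Hypothetical helpers assembled from the suggestions above; names are illustrative.

# Quality tags to prepend to every prompt
QUALITY_TAGS = "masterpiece, best quality, illustration, beautiful detailed, finely detailed, dramatic light, intricate details"

# Resolution cheat sheet from above, keyed by aspect-ratio label
RESOLUTIONS = {
    "9:16": (768, 1344),
    "4:5": (915, 1144),
    "1:1": (1024, 1024),
    "4:3": (1182, 886),
    "3:2": (1254, 836),
    "16:9": (1365, 768),
    "21:9": (1564, 670),
}

def build_prompt(tags):
    """Prepend the suggested quality tags to a list of Danbooru-style tags."""
    return f"{QUALITY_TAGS}, {', '.join(tags)}"

prompt = build_prompt(["1girl", "green hair", "sweater"])
width, height = RESOLUTIONS["1:1"]
```

The resulting `prompt`, `width`, and `height` can then be passed to whichever frontend you use.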
<hr>

## 🧨 Diffusers

Make sure to upgrade `diffusers` to >= 0.18.2:
```
pip install diffusers --upgrade
```

In addition, make sure to install `transformers`, `safetensors`, `accelerate`, and the invisible watermark:
```
pip install invisible_watermark transformers accelerate safetensors
```

Running the pipeline (if you don't swap the scheduler, it will run with the default **EulerDiscreteScheduler**; in this example we swap it to **EulerAncestralDiscreteScheduler**):
```py
import torch
from diffusers.models import AutoencoderKL
from diffusers import StableDiffusionXLPipeline, EulerAncestralDiscreteScheduler

model = "Linaqruf/animagine-xl"
# Load the standalone SDXL VAE
vae = AutoencoderKL.from_pretrained("stabilityai/sdxl-vae")

pipe = StableDiffusionXLPipeline.from_pretrained(
    model,
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
    vae=vae
)

# Swap the default scheduler for Euler Ancestral
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
pipe.to('cuda')

prompt = "face focus, cute, masterpiece, best quality, 1girl, green hair, sweater, looking at viewer, upper body, beanie, outdoors, night, turtleneck"
negative_prompt = "lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry"

image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    width=1024,
    height=1024,
    guidance_scale=12,
    target_size=(1024, 1024),
    original_size=(4096, 4096),
    num_inference_steps=50
).images[0]

image.save("anime_girl.png")
```
<hr>

## Limitation
This model inherits the [limitations](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0#limitations) of Stable Diffusion XL 1.0.