xinsir committed on
Commit d8ad966
1 Parent(s): 818da1a

Update README.md

Files changed (1)
  1. README.md +36 -1
README.md CHANGED
@@ -34,16 +34,26 @@ The control ability is also strong, for example if you are unstatisfied with som

  - **Paper [optional]:** https://arxiv.org/abs/2302.05543

- ### Examples
  ![image1](./000155_scribble_concat.webp)
  ![image2](./000186_scribble_concat.webp)
  ![image3](./000210_scribble_concat.webp)
  ![image4](./000227_scribble_concat.webp)
  ![image5](./000242_scribble_concat.webp)
  ![image6](./000250_scribble_concat.webp)
  ![image7](./000256_scribble_concat.webp)
  ![image8](./000271_scribble_concat.webp)
  ![image9](./000285_scribble_concat.webp)
  ![image10](./000290_scribble_concat.webp)


@@ -157,3 +167,28 @@ images = pipe(
  images[0].save(f"your image save path, png format is usually better than jpg or webp in terms of image quality but got much bigger")
  ```


  - **Paper [optional]:** https://arxiv.org/abs/2302.05543

+ ### Examples [**Note: the following examples were all generated using stabilityai/stable-diffusion-xl-base-1.0 and xinsir/controlnet-scribble-sdxl-1.0**]
+ prompt: purple feathered eagle with specks of light like stars in feathers. It glows with arcane power
  ![image1](./000155_scribble_concat.webp)
+ prompt: manga girl in the city, drip marketing
  ![image2](./000186_scribble_concat.webp)
+ prompt: 17 year old girl with long dark hair in the style of realism with fantasy elements, detailed botanical illustrations, barbs and thorns, ethereal, magical, black, purple and maroon, intricate, photorealistic
  ![image3](./000210_scribble_concat.webp)
+ prompt: a logo for a paintball field named district 7 on a white background featuring paintballs the is bright and colourful eye catching and impactuful
  ![image4](./000227_scribble_concat.webp)
+ prompt: a photograph of a handsome crying blonde man with his face painted in the pride flag
  ![image5](./000242_scribble_concat.webp)
+ prompt: simple flat sketch fox play ball
  ![image6](./000250_scribble_concat.webp)
+ prompt: concept art, a surreal magical Tome of the Sun God, the book binding appears to be made of solar fire and emits a holy, radiant glow, Age of Wonders, Unreal Engine v5
  ![image7](./000256_scribble_concat.webp)
+ prompt: black Caribbean man walking balance front his fate chaos anarchy liberty independence force energy independence cinematic surreal beautiful rendition intricate sharp detail 8k
  ![image8](./000271_scribble_concat.webp)
+ prompt: die hard nakatomi plaza, explosion at the top, vector, night scene
  ![image9](./000285_scribble_concat.webp)
+ prompt: solitary glowing yellow tree in a desert. ultra wide shot. night time. hdr photography
  ![image10](./000290_scribble_concat.webp)


 
  images[0].save(f"your image save path, png format is usually better than jpg or webp in terms of image quality but got much bigger")
  ```

+ ## Evaluation Data
+ The test data is randomly sampled from Midjourney upscaled images with prompts, as the purpose of the project is to let people draw images like Midjourney. Midjourney's users include a large number of professional designers,
+ and the upscaled images tend to have higher beauty scores and better prompt consistency, so they are suitable as a test set to judge the ability of ControlNet. We select 300 prompt-image pairs randomly and generate 4 images per prompt,
+ for a total of 1200 generated images. We calculate the LAION Aesthetic Score to measure beauty and the PerceptualSimilarity to measure control ability, and we find that image quality is consistent with the metric values.
+ We compare our method with other SOTA Hugging Face models and list the results below. Ours is the model with the highest aesthetic score, and it can generate visually appealing images if you prompt it properly.
+
+ Note: The condition images are generated using the HED detector with a random threshold to produce different kinds of lines.
+
+ ## Quantitative Result
+ | metric | xinsir/controlnet-scribble-sdxl-1.0 |
+ |-------|-------|
+ | laion_aesthetic | **6.03** |
+ | perceptual similarity | 0.5701 |
+
+ laion_aesthetic (the higher the better)
+ perceptual similarity (the lower the better)
+
+ Note: The values are calculated with images saved in WebP format; saving in PNG increases the aesthetic values by 0.1-0.3, but the relative ranking remains unchanged.
+
+ ### Conclusion
+
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+ In our evaluation, the model can generate visually appealing images from a simple sketch and a simple prompt. The model supports any type and width of line: a thick line gives coarse control
+ that follows the prompt you write more closely, while a thin line gives strong control that follows the condition image more closely. The model can help you complete a drawing from coarse to fine. It achieves a higher
+ aesthetic score than xinsir/controlnet-canny-sdxl-1.0, but its control ability is slightly lower because of the thick lines.
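
Reviewer note (not part of the commit): the Evaluation Data note says the condition images come from the HED detector with a random threshold. The following is a minimal sketch of what such random thresholding could look like, assuming the soft edge map is already available as a nested list of floats in [0, 1]. The function name `binarize_edges`, the threshold range 0.01 to 0.10, and the toy edge map are all assumptions for illustration; the HED detection step itself is omitted, and this is not necessarily how the author generated the test data.

```python
import random


def binarize_edges(edge_map, low=0.01, high=0.10, rng=None):
    """Turn a soft edge map (floats in [0, 1]) into a 0/255 scribble.

    The threshold is drawn uniformly from [low, high]. Both bounds are
    assumed values for illustration, not taken from the README.
    """
    if rng is None:
        rng = random.Random()
    threshold = rng.uniform(low, high)
    # Pixels above the threshold become line pixels (255), the rest background (0).
    scribble = [[255 if v > threshold else 0 for v in row] for row in edge_map]
    return scribble, threshold


# Toy 3x5 "edge map": a strong horizontal stroke with a faint halo around it.
edge_map = [
    [0.00, 0.03, 0.06, 0.03, 0.00],
    [0.20, 0.60, 0.90, 0.60, 0.20],
    [0.00, 0.03, 0.06, 0.03, 0.00],
]

scribble, threshold = binarize_edges(edge_map, rng=random.Random(0))
print(f"threshold = {threshold:.3f}")
for row in scribble:
    print("".join("#" if v else "." for v in row))
```

In this sketch a lower threshold keeps more of the faint halo and so yields a thicker stroke, while a higher threshold keeps only the strong center. Drawing the threshold at random therefore produces condition images with a range of line widths.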