FoodDesert committed
Commit cc08afa
1 Parent(s): aed875f

Update README.md

Files changed (1):
  1. README.md +17 -14
README.md CHANGED
@@ -12,12 +12,13 @@ license: apache-2.0
 
These Stable Diffusion embeddings capture what it means for an image to be uninteresting.
This is useful because it allows you to instruct your model NOT to produce images that look uninteresting.
- If you're using the [Automatic1111 Stable Diffusion WebUI](https://github.com/AUTOMATIC1111/stable-diffusion-webui), just download one of the pt files into your stable-diffusion-webui\embeddings directory and use
+ If you're using the [Automatic1111 Stable Diffusion WebUI](https://github.com/AUTOMATIC1111/stable-diffusion-webui),
+ just download one of the .pt files into your stable-diffusion-webui\embeddings directory and use
the embedding's name in your NEGATIVE prompt for more interesting outputs.
<table>
<tr>
- <td style="text-align: center;"><div style="text-align: center;"><img src="boring_folder.png" alt="Download a .pt file to stable-diffusion-webui\embeddings" style="max-height: 250px; display: inline-block;"></div></td>
- <td style="text-align: center;"><div style="text-align: center;"><img src="boring_automatic1111_interface.png" alt="Type the embedding's name (without the .pt extension) in your negative prompt" style="max-height: 250px; display: inline-block;"></div></td>
+ <td style="text-align: center;"><div style="text-align: center;"><img src="boring_folder.png" alt="Download a .pt file to stable-diffusion-webui\embeddings" style="max-height: 220px; display: inline-block;"></div></td>
+ <td style="text-align: center;"><div style="text-align: center;"><img src="boring_automatic1111_interface.png" alt="Type the embedding's name (without the .pt extension) in your negative prompt" style="max-height: 220px; display: inline-block;"></div></td>
</tr>
<tr>
<td style="text-align: center;"><strong style="font-size: larger;">Download a .pt file to stable-diffusion-webui\embeddings</strong></td>
@@ -30,30 +31,32 @@ the embedding's name in your NEGATIVE prompt for more interesting outputs.
 
## Model Description
 
- The motivation for boring_e621 is that negative embeddings like [Bad Prompt](https://huggingface.co/datasets/Nerfgun3/bad_prompt),
+ The motivation for Boring Embeddings is that negative embeddings like [Bad Prompt](https://huggingface.co/datasets/Nerfgun3/bad_prompt),
whose training is described [here](https://www.reddit.com/r/StableDiffusion/comments/yy2i5a/i_created_a_negative_embedding_textual_inversion/),
depend on manually curated lists of tags describing features people do not want their images to have, such as "deformed hands". Some problems with this approach are:
* Manually compiled lists will inevitably be incomplete.
* Models might not always understand the tags well due to a dearth of training images labeled with these tags.
* It can only capture named concepts. If there exist unnamed yet visually unappealing concepts that just make an image look wrong, but for reasons that cannot be succinctly explained, they will not be captured by a list of tags.
 
- To address these problems, boring_e621 employs textual inversion on a set of images automatically extracted from the art site
- e621.net, a rich resource of millions of hand-labeled artworks, each of which is both human-labeled topically and rated
- according to its quality. E621.net allows users to express their approval of an artwork by either up-voting it, or marking it as a favorite.
- Boring_e621 was specifically trained on artworks automatically selected from the site according to the criteria
- that no user has ever Favorited or Up-Voted them. boring_e621 thus learned to produce low-quality images, so when it is
- used in the negative prompt of a stable diffusion image generator, the model avoids making mistakes that would make the generation more boring.
+ To address these problems, we employ textual inversion on a set of images automatically extracted from popular art sites
+ such as e621.net, derpibooru.org, and danbooru.donmai.us. Each of these sites is a rich resource of millions of
+ hand-labeled artworks and allows users to express their approval of an artwork by either up-voting it or marking it as a favorite.
+ The Boring embeddings were specifically trained on artworks automatically selected from these sites according to the criteria
+ that no user has ever favorited them and they have zero or only a very small number of up- or down-votes. The Boring embeddings
+ thus learned to produce uninteresting, low-quality images, so when they are used in the negative prompt of a Stable Diffusion image generator,
+ the model avoids making mistakes that would make the generation more boring.
<br>
 
+
# Bias, Risks, and Limitations
- * Using this as a negative embedding often sacrifices some fidelity to the prompt. For example, characters in the image may disappear or change eye/skin color.
- * Using this as a negative embedding may introduce unexpected or undesired content into the image to make it look less boring.
- * Unlike other negative embeddings, this is not intended to fix problems like extra limbs or deformed hands. It can be used alongside other negative embeddings to fix deformities.
+ * Using these negative embeddings sacrifices some fidelity to the prompt in exchange for improved overall quality. For example, characters in the image may disappear or change eye/skin color.
+ * Using these negative embeddings may introduce unexpected or undesired content into the image to make it look less boring.
+ * Unlike other negative embeddings, the Boring embeddings are not intended to fix problems like extra limbs or deformed hands. They can be used alongside other negative embeddings to fix deformities.
<br>
 
# Evaluation
 
- To qualitatively evaluate how well boring_e621 has learned to improve image quality, we apply it to 4 simple sample prompts using the base Stable Diffusion 1.5 model.
+ To qualitatively evaluate how well the Boring embeddings have learned to improve image quality, we apply them to simple sample prompts using the base Stable Diffusion 1.5 model.
 
![boring_e621 and boring_e621_v4 Performance on Simple Prompts](tmpoqs1d_vv.png)
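The training-set selection rule the updated Model Description describes (never favorited, at or near zero votes) can be sketched as a simple filter over post metadata. The field names (`fav_count`, `up_votes`, `down_votes`) and the vote threshold are illustrative assumptions, not the exact fields or criteria the author used:

```python
def is_boring(post, max_votes=2):
    """Return True if a post matches the 'boring' training criterion:
    never favorited, and zero or only a handful of up/down votes.
    Field names here are illustrative, not a specific site's API."""
    return (post["fav_count"] == 0
            and post["up_votes"] + post["down_votes"] <= max_votes)

posts = [
    {"id": 1, "fav_count": 0, "up_votes": 0, "down_votes": 0},   # unloved -> boring
    {"id": 2, "fav_count": 5, "up_votes": 12, "down_votes": 1},  # popular -> excluded
    {"id": 3, "fav_count": 0, "up_votes": 1, "down_votes": 1},   # near-zero votes -> boring
]
boring_ids = [p["id"] for p in posts if is_boring(p)]  # → [1, 3]
```

Images passing such a filter would then be downloaded and fed to textual inversion as the training set.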
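Mechanically, a negative embedding works through classifier-free guidance: WebUIs in the Automatic1111 style substitute the negative prompt's conditioning for the usual empty-prompt conditioning, so each sampling step pushes the prediction away from what the embedding describes. A minimal, dependency-free sketch of that combination step (the function and toy values are illustrative, not the WebUI's actual code):

```python
def cfg_step(eps_neg, eps_pos, guidance_scale):
    """One classifier-free-guidance combination step.

    eps_neg: denoiser prediction conditioned on the negative prompt
             (the empty prompt when no negative prompt is given).
    eps_pos: denoiser prediction conditioned on the positive prompt.
    """
    return [n + guidance_scale * (p - n) for n, p in zip(eps_neg, eps_pos)]

# Toy 4-component "noise predictions" standing in for real latents:
eps_neg = [0.2, 0.0, -0.1, 0.5]  # prediction for the negative prompt, e.g. boring_e621
eps_pos = [0.3, 0.1, 0.0, 0.4]   # prediction for the user's prompt
guided = cfg_step(eps_neg, eps_pos, guidance_scale=7.5)
# Each component is moved away from the negative-prompt prediction,
# suppressing whatever the embedding encodes as "boring".
```

This is why the README's instruction to put the embedding's name in the NEGATIVE prompt is all that is needed: the guidance arithmetic does the rest.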