FoodDesert/Boring_Embeddings

textual inversion embeddings

image-generation

stable diffusion

AI Art

FoodDesert committed on Jun 12, 2023

Commit f82c848 • 1 Parent(s): 4a59959

Update README.md

README.md +5 -20

@@ -22,11 +22,11 @@ depend on manually curated lists of tags describing features people do not want
 * Manually compiled lists will inevitably be incomplete.
 * Models might not always understand the tags well due to a dearth of training images labeled with these tags.
 * It can only capture named concepts.  If there exist unnamed yet visually unappealing concepts that just make an image look wrong,
-* but for reasons that cannot be succinctly explained, they will not be captured by a list of tags.
 <br>
 To address these problems, boring_e621 employs textual inversion on a set of images automatically extracted from the art site
-e621.net, a rich resource of millions of hand-labeled artworks, each of which is both hand-labeled topically and rated
 according to its quality.  E621.net allows users to express their approval of an artwork by either up-voting it, or marking it as a favorite.
 Boring_e621 was specifically trained artworks automatically selected from the site according to the criteria
 that no user has ever Favorited or Up-Voted them.  boring_e621 thus learned to produce low-quality images, so when it is
@@ -41,26 +41,11 @@ used in the negative prompt of a stable diffusion image generator, the model avo
 # Evaluation
-I extracted the tags from three e621 images and used them to construct a set of test prompts.
-* one prompt was constructed from an image with a high number of favorites.
-* one prompt was constructed from an image with a moderate number of favorites.
-* one prompt was constructed from an image with 0 favorites.
-<br>
-I then generated test images from each of these prompts, each time using a different negative embedding as the negative prompt.  Particularly, I tried:
-* [EasyNegative](https://huggingface.co/datasets/gsdf/EasyNegative)
-* [Bad Artist](https://huggingface.co/nick-x-hacker/bad-artist)
-* [Bad Prompt](https://huggingface.co/datasets/Nerfgun3/bad_prompt)
-* [boring_e621](this)
-<br>
-Finally, I qualitatively evaluated the attractiveness and interestingness of the resulting images, though I will let you draw your own conclusions from the output below.
-<br>
-## Results
-![Negative Embedding Comparison](https://i.imgur.com/d7R4gGi.jpg)
 ## Other Models

 * Manually compiled lists will inevitably be incomplete.
 * Models might not always understand the tags well due to a dearth of training images labeled with these tags.
 * It can only capture named concepts.  If there exist unnamed yet visually unappealing concepts that just make an image look wrong,
+  but for reasons that cannot be succinctly explained, they will not be captured by a list of tags.
 <br>
 To address these problems, boring_e621 employs textual inversion on a set of images automatically extracted from the art site
+e621.net, a rich resource of millions of hand-labeled artworks, each of which is both human-labeled topically and rated
 according to its quality.  E621.net allows users to express their approval of an artwork by either up-voting it, or marking it as a favorite.
 Boring_e621 was specifically trained artworks automatically selected from the site according to the criteria
 that no user has ever Favorited or Up-Voted them.  boring_e621 thus learned to produce low-quality images, so when it is
 # Evaluation
+To qualitatively evaluate how well boring_e621 has learned to improve image quality, we apply it to 4 simple sample prompts using the base Stable Diffusion 1.5 model.
+[boring_e621 and boring_e621_v4 Performance on Simple Prompts](tmpoqs1d_vv.png)
+As we can see, putting these embeddings in the negative prompt yields a more delicious burger, a more vibrant and detailed landscape, a prettier pharaoh, and a more 3-D-looking aquarium.
 ## Other Models