gustproof committed
Commit 83fb0b0
Parent: d7853bd

Update README.md

Files changed (1): README.md (+31 -0)
README.md CHANGED
---
license: agpl-3.0
---

# SD1 Style Components (experimental)

Style control for Stable Diffusion 1.x anime models

## What is this?

It is IP-Adapter, but for (anime) styles. Instead of CLIP image embeddings, image generation is conditioned on 30-dimensional style embeddings, which can either be extracted from one or more images or created manually.
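
A sketch of what this looks like in use (every name here, `StyleComponentsPipeline`, `extract_style_embedding`, and the checkpoint path, is a hypothetical placeholder rather than a published API; the working interface is the Colab notebook linked below):

```python
import torch
# Hypothetical wrapper around an SD1.x pipeline plus the modified IP-Adapter.
from style_components import StyleComponentsPipeline, extract_style_embedding

pipe = StyleComponentsPipeline.from_pretrained("path/to/sd1-style-components")

# Either extract a 30-d style embedding from one or more reference images...
style = extract_style_embedding(["reference1.png", "reference2.png"])

# ...or create one manually as an arbitrary 30-d tensor.
style = torch.zeros(30)
style[3] = 2.0  # emphasize a single interpretable style component

image = pipe("1girl, smile", style_embedding=style).images[0]
```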

## Why?

Currently, the main means of style control is artist tags. This method reasonably raises the concern of style plagiarism.
By breaking styles down into interpretable components that are present in all artists, direct copying of styles can be avoided.
Furthermore, new styles can easily be created by manipulating the magnitudes of the style components, offering more controllability than stacking artist tags or LoRAs.
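
For example, since a style is just a 30-d vector, creating a new style is plain vector arithmetic (a runnable toy example; the random vectors stand in for embeddings extracted from real images, and the component index is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
style_a = rng.normal(size=30)   # stand-in for one extracted style embedding
style_b = rng.normal(size=30)   # stand-in for another

blend = 0.7 * style_a + 0.3 * style_b   # interpolate between two styles
louder = style_a.copy()
louder[5] *= 2.0                        # exaggerate a single style component
```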

Additionally, this can potentially be useful for general-purpose training, as training with a style condition may reduce style leakage into concepts.
This also serves as a demonstration that image models can be conditioned on arbitrary tensors other than text or images.
Hopefully, more people will come to see that it is not necessary to force conditions that are inherently numerical (aesthetic scores, dates, ...) into text-form tags.

## How do I use it?

Currently, a [Colab notebook](https://colab.research.google.com/drive/1AKXiHHBAnzbtKyToN6WdzOov-niJudcL?usp=sharing) with a Gradio interface is available.
As this is only an experimental preview, proper support for popular web UIs will not be added before the models reach a stable state.

## Technical details

First, a style embedding model is trained with supervised contrastive learning on an [artists dataset](https://huggingface.co/datasets/gustproof/artists/blob/main/artists.zip).
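
A minimal sketch of this supervised contrastive (SupCon) objective, assuming one embedding per image and artist IDs as labels (the temperature and other details are assumptions, not the exact training configuration):

```python
import torch
import torch.nn.functional as F

def supcon_loss(features: torch.Tensor, labels: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """Supervised contrastive loss (Khosla et al., 2020): images by the same
    artist are pulled together in embedding space, all others pushed apart."""
    z = F.normalize(features, dim=1)             # (N, D) unit-norm embeddings
    sim = z @ z.T / tau                          # (N, N) scaled cosine similarities
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(eye, float("-inf"))    # drop self-pairs everywhere
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos = (labels[:, None] == labels[None, :]) & ~eye   # same-artist pairs
    n_pos = pos.sum(dim=1)
    # mean log-probability over each anchor's positives, negated
    loss = -(log_prob.masked_fill(~pos, 0.0).sum(dim=1) / n_pos.clamp(min=1))
    return loss[n_pos > 0].mean()
```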

Then, the first 30 principal components of a PCA over the learned embeddings are extracted.
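
A sketch of this projection step with scikit-learn (the file name and the raw embedding dimension are assumptions):

```python
import numpy as np
from sklearn.decomposition import PCA

emb = np.load("artist_embeddings.npy")   # (num_images, D) learned style embeddings
pca = PCA(n_components=30).fit(emb)      # keep the first 30 principal components

style_30d = pca.transform(emb)           # (num_images, 30) conditioning vectors
print(pca.explained_variance_ratio_.sum())  # variance retained by 30 dims
```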

Finally, a modified IP-Adapter is trained on anime-final-pruned using the same dataset with WD1.4 tags and the projected 30-d embeddings. The training resolution is 576×576 with variable aspect ratios.

## Acknowledgements

This is largely inspired by [Inserting Anybody in Diffusion Models via Celeb Basis](http://arxiv.org/abs/2306.00926) and [IP-Adapter](https://github.com/tencent-ailab/IP-Adapter). Training and inference code is modified from IP-Adapter ([license](https://github.com/tencent-ailab/IP-Adapter/blob/main/LICENSE)).