Crosstyan committed on
Commit d893fb0
1 Parent(s): d001e0c

typo and model card

Files changed (1):
  1. README.md +23 -4
README.md CHANGED
@@ -1,14 +1,30 @@
 # BPModel

 BPModel is an experimental Stable Diffusion model based on [ACertainty](https://huggingface.co/JosephusCheung/ACertainty) from [Joseph Cheung](https://huggingface.co/JosephusCheung).

- Why is the Model even exist? There are loads of Stable Diffusion model out there, especially anime style models.
 Well, are there any models trained with a base resolution (`base_res`) of 768 or even 1024 before? I don't think so.
 Here it is, the BPModel, a Stable Diffusion model you may love or hate.
 Trained with 5k high quality images that suit my taste (not necessarily yours, unfortunately) from [Sankaku Complex](https://chan.sankakucomplex.com) with annotations. Not the best strategy, since a pure combination of tags may not be the optimal way to describe an image, but I don't need to do extra work. And no, I won't feed any AI generated images
 to the model, even though it might outlaw the model from being used in some countries.

- The training of a high resolution model requires a significant amount of GPU hours and can be costly. In this particular case, 10 V100 GPU hours were spent on training a model with a resolution of 512, while 60 V100 GPU hours were spent on training a model with a resolution of 768. An additional 50 V100 GPU hours were also spent on training a model with a resolution of 1024, although only 10 epochs were run. The results of the training on the 1024 resolution model did not show a significant improvement compared to the 768 resolution model, and the resource demands, including a batch size of 1 on a V100 with 32G VRAM, were high. However, training on the 768 resolution did yield better results than training on the 512 resolution, and it is worth considering as an option. It is worth noting that Stable Diffusion 2.x also chose to train on a 768 resolution model. However, it may be more efficient to start with training on a 512 resolution model due to the slower training process and the need for additional prior knowledge to speed up the training process when working with a 768 resolution.

 [Mikubill/naifu-diffusion](https://github.com/Mikubill/naifu-diffusion) is used as the training script, and I also recommend
 checking out [CCRcmcpe/scal-sdt](https://github.com/CCRcmcpe/scal-sdt).
@@ -55,7 +71,7 @@ described in [JosephusCheung/ACertainThing](https://huggingface.co/JosephusCheun
 > It does not always stay true to your prompts; it adds irrelevant details, and sometimes these details are highly homogenized.

 BPModel, which has been fine-tuned on a relatively small dataset, is prone
- to overfitting. This is not surprising given the size of the dataset, but the
 strong prior knowledge of ACertainty (full Danbooru) and Stable Diffusion
 (LAION) helps to minimize the impact of overfitting.
 However, I believe it would perform
@@ -123,7 +139,10 @@ Steps: 40, Sampler: Euler a, CFG scale: 8, Seed: 2668993375, Size: 960x1600, Mod

 ## Usage

- For better performance, it is strongly recommended to use Clip skip (CLIP stop at last layers): 2.

 ## About the Model Name

+ ---
+ language:
+ - en
+ license: creativeml-openrail-m
+ tags:
+ - stable-diffusion
+ - stable-diffusion-diffusers
+ - text-to-image
+ - diffusers
+ inference: true
+ widget:
+ - text: "1girl with blonde two side up disheveled hair red eyes in black serafuku red ribbon, upper body, simple background, grey background, collarbone"
+   example_title: "example 1girl"
+ ---
+
+
 # BPModel

 BPModel is an experimental Stable Diffusion model based on [ACertainty](https://huggingface.co/JosephusCheung/ACertainty) from [Joseph Cheung](https://huggingface.co/JosephusCheung).

+ Why does this model even exist? There are loads of Stable Diffusion models out there, especially anime style models.
 Well, are there any models trained with a base resolution (`base_res`) of 768 or even 1024 before? I don't think so.
 Here it is, the BPModel, a Stable Diffusion model you may love or hate.
 Trained with 5k high quality images that suit my taste (not necessarily yours, unfortunately) from [Sankaku Complex](https://chan.sankakucomplex.com) with annotations. Not the best strategy, since a pure combination of tags may not be the optimal way to describe an image, but I don't need to do extra work. And no, I won't feed any AI generated images
 to the model, even though it might outlaw the model from being used in some countries.

+ Training a high resolution model requires a significant amount of GPU hours and can be costly. In this case, 10 V100 GPU hours were spent training at a resolution of 512, while 60 V100 GPU hours were spent at a resolution of 768. An additional 50 V100 GPU hours went into a 1024 resolution model, although only 10 epochs were run. Training at 1024 did not show a significant improvement over the 768 model, and the resource demands were high: only a batch size of 1 fit on a V100 with 32 GB of VRAM. However, training at 768 did yield better results than training at 512, so it is worth considering as an option; notably, Stable Diffusion 2.x also chose to train a 768 resolution model. Still, it may be more efficient to start by training at 512, since the 768 training process is slower and benefits from additional prior knowledge to speed it up.

 [Mikubill/naifu-diffusion](https://github.com/Mikubill/naifu-diffusion) is used as the training script, and I also recommend
 checking out [CCRcmcpe/scal-sdt](https://github.com/CCRcmcpe/scal-sdt).
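To put the resolution trade-off in rough numbers, here is a back-of-the-envelope sketch. It assumes per-step training cost scales with pixel count (quadratic in the base resolution); only the V100 GPU-hour figures come from the paragraph above, and the helper names are illustrative:

```python
# Back-of-the-envelope sketch, not a benchmark: assumes per-step training cost
# scales with pixel count, i.e. quadratically in the base resolution.

def pixel_scale(base_res: int, reference: int = 512) -> float:
    """How many times more pixels a square base_res image has than the reference."""
    return (base_res / reference) ** 2

def latent_side(base_res: int, downscale: int = 8) -> int:
    """Side length of the Stable Diffusion latent (the VAE downscales by 8x)."""
    return base_res // downscale

for res in (512, 768, 1024):
    print(f"{res}: {pixel_scale(res):.2f}x pixels, latent {latent_side(res)}x{latent_side(res)}")
```

A 768 image carries 2.25x the pixels of a 512 one and a 1024 image 4x, which lines up with the much larger GPU-hour budgets (10 vs. 60 vs. 50 V100 hours) reported above.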
 
 > It does not always stay true to your prompts; it adds irrelevant details, and sometimes these details are highly homogenized.

 BPModel, which has been fine-tuned on a relatively small dataset, is prone
+ to overfit. This is not surprising given the size of the dataset, but the
 strong prior knowledge of ACertainty (full Danbooru) and Stable Diffusion
 (LAION) helps to minimize the impact of overfitting.
 However, I believe it would perform
 

 ## Usage

+ The [`bp_1024_e10.ckpt`](bp_1024_e10.ckpt) doesn't include a VAE, so you should use one of the popular community VAEs when running it with [AUTOMATIC1111/stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui), or you will see the
+ LaTeNt SpAcE!
+
+ For better performance, it is strongly recommended to set Clip skip (CLIP stop at last layers) to 2.

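To show what "Clip skip 2" actually selects, here is a minimal illustrative sketch (the function and list are hypothetical stand-ins, not webui's real API): the prompt is conditioned on the CLIP text encoder's penultimate layer output instead of its final one.

```python
# Illustrative only: shows which CLIP text-encoder layer "Clip skip" picks.
# Real implementations index into the encoder's per-layer hidden states the same way.

def select_clip_layer(hidden_states: list, clip_skip: int = 1):
    """clip_skip=1 -> final layer output (default); clip_skip=2 -> penultimate layer."""
    if not 1 <= clip_skip <= len(hidden_states):
        raise ValueError("clip_skip out of range")
    return hidden_states[-clip_skip]

layers = [f"layer_{i}_output" for i in range(1, 13)]  # toy stand-ins for tensors
print(select_clip_layer(layers, 2))  # penultimate layer, i.e. Clip skip 2
```

Anime models descended from NovelAI-style training were conditioned on the penultimate layer, which is why Clip skip 2 tends to work better with them.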
 ## About the Model Name