Guizmus committed
Commit c2ee334 • 1 Parent(s): 8bd4ad6

adding 3D style (#3)


- adding 3D style (f8c222e015d73189f0f9d56e67dd0a7b5f8494c5)
- Delete feature_extractor/preprocessor_config.json (600b633cc0c8256c7e508c625a695ad30f9adb45)
- Delete safety_checker/config.json (3bb7b526b50756a1215fe3d3b93e3c29bb606402)
- Delete safety_checker/pytorch_model.bin (144383ac0e548e6a019adebb03d51118a4d0bdec)
- Delete scheduler/scheduler_config.json (1c711b9d09c2c53b8faea0f1a3f320029bc42f59)
- Delete text_encoder/config.json (f1b1e7d061abcbf0f6ce5187f9815abf5640b69a)
- Delete text_encoder/pytorch_model.bin (36e1cf890820edc7a6b434668e612bc4cbb42615)
- Delete tokenizer/merges.txt (63ba2abe2257eabea389c832a39bd0db4c4f10a5)
- Delete tokenizer/special_tokens_map.json (57fcbbe33e0aead1ee8d80179252fdbdd5769c0b)
- Delete tokenizer/tokenizer_config.json (8e05e5ef62985ce795eb6512ddb6dd2d89f9a189)
- Delete tokenizer/vocab.json (0f9baa7d5016817d91c0393ba9bf9a985e2b6b6e)
- Delete unet/config.json (462281ff8c5ec7b06c907a5d281dd75441cc21be)
- Delete unet/diffusion_pytorch_model.bin (ac78d415853cbbf3b47104b6d306a06a52168f37)
- Delete vae/config.json (f37c02e080501b4eea56d63db81bcbad497ed7ab)
- Delete vae/diffusion_pytorch_model.bin (53e013e1157da9d92cc742034c60af1597f2d1eb)
- Delete model_index.json (8fcaa13d5b0e2606cc862bca8a9a9cdab7e0335a)
- Create images/init.txt (0f4f7999b268c6bd9b2ea785d14d53c8e8d4d904)
- Upload 2 files (128790a3e4638621e3ece88c6c3a84748c506c21)
- Delete showcase_AnimeChan.jpg (5b91f69c8a54d1bee47a5cc4939316d749515ee4)
- Update README.md (0f7f8298a2cc4c3d3344b3a9bc64187287a756e0)
- Delete images/init.txt (3156c46ce2ff8c0c7f99909935c34c4e67cba8b8)
- Create ckpt/init.txt (1b159725b0907d6c0075d04bf011ba0de04a6475)
- Upload 4 files (58331fd0f71784913bdc87c24413f6e195a132b9)
- Delete ckpt/AnimeChanStyle_v1.ckpt (c593296833f1fba536dbbd684669952f96ab492c)
- Upload AnimeChanStyle-v1.ckpt (ac5ca89086851c5cc854bd4c5f258a449b705aac)
- Delete ckpt/init.txt (aaf294b5cdc53795d1d3870063c7ce4ace3c2b45)
- Delete AnimeChan Style.zip (0149387ec67ed1c7b7a7d976c69cedce3a269911)
- Delete AnimeChanStyle-v2.ckpt (b72839865a3747527990a7926f29dba867004861)
- Delete AnimeChanStyle_v1.ckpt (29dc661a31775326d74bc290996373bdac365228)
- Delete showcase.jpg (bf74bb38acbdbaf96c9ad2e906c847c9d919ebb0)
- Create datasets/init.txt (e0f26b18b98065c55460401a2752626cb5c11dd0)
- Upload 3 files (af2f733927dba93cbad1fdddd798c139af16565a)
- Upload 4 files (805427d80e6556475d61bc47bb0a3d091437b6ac)
- Delete images/showcase_AnimeChan.jpg (a489b8d6e659ccf3b3b3da95cdbf11c3c5da8b3f)
- Delete images/showcase_3DChanStyle.jpg (3b66820ffeac2d5b565ddf3efb0042519afe6bf4)
- Delete datasets/init.txt (0d88997f8ee7ad98828a57f8ba0b9ab125c5f5fb)
- Upload README.md (151905fa36904b83eac769cf103ecaf1f2abdf6d)

README.md CHANGED
@@ -2,46 +2,88 @@
 language:
 - en
 license: creativeml-openrail-m
-thumbnail: "https://huggingface.co/Guizmus/SD_DreamerCommunities_Collection/resolve/main/showcase_AnimeChan.jpg"
+thumbnail: "https://huggingface.co/Guizmus/SD_DreamerCommunities_Collection/resolve/main/images/showcase_main.jpg"
 tags:
 - stable-diffusion
 - text-to-image
 - image-to-image
 library_name: "EveryDream"
-datasets:
-- Guizmus/AnimeChanStyle
 inference: false
 ---
 
+# Introduction
+
+This is a collection of models made from and for the users of the Stable Diffusion Discord server. Different categories of channels exist there; the "Dreamers Communities" cover a panel of subjects such as Anime, 3D, or Architecture, and each of these channels has users posting images made with Stable Diffusion. After asking the users, and depending on the activity of each channel, I collect a dataset from new submissions or from the channel's history, then build a model representing the style of each channel, so that users can more easily produce images in the style they like and mix it with others.
+
+These models are mainly trained with EveryDream, and the compatible datasets should eventually be merged into a single Mega Model. Some models, like the Anime one, need to stay on a different starting point and may not get merged.
+
+# 3DChan Style
+
+## Dataset & training
+
+This model is based on the [RunwayML SD 1.5](https://huggingface.co/runwayml/stable-diffusion-v1-5) model with the updated VAE.
+
+The dataset was a collaborative effort of the Stable Diffusion #3D channel, made of pictures from the users themselves, each using their own techniques.
+
+120 pictures in the dataset, 500 total repeats each, over 10 epochs at LR 1e-6.
+
+This was trained using EveryDream with full captions on all training pictures.
+
+The style is called with the token **3D Style**.
+
+Other significant tokens: rick roll, fullbody shot, bad cosplay man
+
+## Showcase & Downloads
+
+![Showcase](https://huggingface.co/Guizmus/SD_DreamerCommunities_Collection/resolve/main/images/showcase_3DChanStyle-v1.jpg)
+
+[CKPT (2GB)](https://huggingface.co/Guizmus/SD_DreamerCommunities_Collection/resolve/main/ckpt/3DStyle-v1.ckpt)
+
+[CKPT with training optimizers (11GB)](https://huggingface.co/Guizmus/SD_DreamerCommunities_Collection/resolve/main/ckpt/3DStyle-v1_with_optimizers.ckpt)
+
+[Dataset](https://huggingface.co/Guizmus/SD_DreamerCommunities_Collection/resolve/main/datasets/3DChanStyle-v1.zip)
+
 # AnimeChan Style
-<p>
-<img alt="Showcase" src="https://huggingface.co/Guizmus/SD_DreamerCommunities_Collection/resolve/main/showcase_AnimeChan.jpg"/><br/>
-This model was based on <a href="https://huggingface.co/naclbit/trinart_stable_diffusion_v2">Trinart</a> model.<br/>
-The dataset was a collaborative effort of the Stable Diffusion #anime channel, made of pictures from the users themselves using their different techniques.<br/>
-100 total pictures in the dataset, 300 repeats total each, over 6 Epoch on LR1e-6.<br/>
-This was trained using EveryDream with a full caption of all training pictures. The dataset can be found <a href="https://huggingface.co/datasets/Guizmus/AnimeChanStyle">here</a>.<br/>
-<br/>
-The style will be called by the use of the token <b>AnimeChan Style</b>.<br/>
-<br/>
-To access this model, you can download the CKPT file below, or use the <a href="https://huggingface.co/Guizmus/AnimeChanStyle/tree/main">diffusers</a>
-</p>
-
-[current v2 download link](https://huggingface.co/Guizmus/SD_DreamerCommunities_Collection/resolve/main/AnimeChanStyle-v2.ckpt)
-
-[dataset for the second version](https://huggingface.co/datasets/Guizmus/AnimeChanStyle)
-
-## First version
-
-![first version showcase](https://huggingface.co/Guizmus/SD_DreamerCommunities_Collection/resolve/main/showcase.jpg)
-
-[first version download link](https://huggingface.co/Guizmus/SD_DreamerCommunities_Collection/resolve/main/AnimeChanStyle_v1.ckpt)
-
-[dataset for the first version](https://huggingface.co/Guizmus/SD_DreamerCommunities_Collection/resolve/main/AnimeChan%20Style.zip)
-
-## License
-
-This model is open access and available to all, with a CreativeML OpenRAIL-M license further specifying rights and usage.
+
+## Dataset & training
+
+This model is based on the [Trinart](https://huggingface.co/naclbit/trinart_stable_diffusion_v2) model.
+
+The dataset was a collaborative effort of the Stable Diffusion #anime channel, made of pictures from the users themselves, each using their own techniques.
+
+100 pictures in the dataset, 300 total repeats each, over 6 epochs at LR 1e-6.
+
+This was trained using EveryDream with full captions on all training pictures.
+
+The style is called with the token **AnimeChan Style**.
+
+## Downloads v2
+
+![Showcase](https://huggingface.co/Guizmus/SD_DreamerCommunities_Collection/resolve/main/images/showcase_AnimeChan-v2.jpg)
+
+[CKPT (2GB)](https://huggingface.co/Guizmus/SD_DreamerCommunities_Collection/resolve/main/ckpt/AnimeChanStyle-v2.ckpt)
+
+[Dataset](https://huggingface.co/Guizmus/SD_DreamerCommunities_Collection/resolve/main/datasets/AnimeChanStyle-v2.zip)
+
+## Downloads v1
+
+![Showcase](https://huggingface.co/Guizmus/SD_DreamerCommunities_Collection/resolve/main/images/showcase_AnimeChan-v1.jpg)
+
+[CKPT (2GB)](https://huggingface.co/Guizmus/SD_DreamerCommunities_Collection/resolve/main/ckpt/AnimeChanStyle-v1.ckpt)
+
+[Dataset](https://huggingface.co/Guizmus/SD_DreamerCommunities_Collection/resolve/main/datasets/AnimeChanStyle-v1.zip)
+
+# License
+
+These models are open access and available to all, with a CreativeML OpenRAIL-M license further specifying rights and usage.
 The CreativeML OpenRAIL License specifies:
 
 1. You can't use the model to deliberately produce nor share illegal or harmful outputs or content
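With the diffusers layout removed in this commit, the CKPT files become the primary artifacts. A minimal usage sketch, assuming a diffusers build with single-file checkpoint loading and a CUDA GPU (illustrative only, not part of the model card):

```python
# Minimal sketch: load the 3DChan Style checkpoint directly and invoke the
# trained style token. Assumes diffusers with from_single_file support.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_single_file(
    "https://huggingface.co/Guizmus/SD_DreamerCommunities_Collection/resolve/main/ckpt/3DStyle-v1.ckpt",
    torch_dtype=torch.float16,
).to("cuda")

# Per the README, the style is called with the token "3D Style".
image = pipe("a cozy cabin in the woods, 3D Style").images[0]
image.save("cabin_3d_style.png")
```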
unet/diffusion_pytorch_model.bin → ckpt/3DStyle-v1.ckpt RENAMED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a8ab275da85b3ff348e1e4a60cda864dc14acbb0676dd3f813160913c29bd740
-size 3438366373
+oid sha256:9698c46f614733b95d75cae29e777be15b2f13758f92d8eb113d566d79b665a6
+size 2132888989
safety_checker/pytorch_model.bin → ckpt/3DStyle-v1_with_optimizers.ckpt RENAMED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:16d28f2b37109f222cdc33620fdd262102ac32112be0352a7f77e9614b35a394
-size 1216064769
+oid sha256:8bca5d3ef403ad1ecfa7d3c9c5717a3bd5fa7899f93cd5526e4d55bbbd380147
+size 12126930715
AnimeChanStyle_v1.ckpt → ckpt/AnimeChanStyle-v1.ckpt RENAMED
File without changes
AnimeChanStyle-v2.ckpt → ckpt/AnimeChanStyle-v2.ckpt RENAMED
File without changes
text_encoder/pytorch_model.bin → datasets/3DChanStyle-v1.zip RENAMED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a2d9696908f0799f233577504bd321350000c0ccf236df5f9ebb7561a24af46e
-size 492307041
+oid sha256:0b659800b6b5140f8e953be05d7acd6d02296dbcbbd43dae944af84b2c5d131a
+size 29246736
AnimeChan Style.zip → datasets/AnimeChanStyle-v1.zip RENAMED
File without changes
vae/diffusion_pytorch_model.bin → datasets/AnimeChanStyle-v2.zip RENAMED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6723bacd3c60b11a2b4e6007338a54c6964c210116c3ccecb3bfc80e218afc8f
-size 334711857
+oid sha256:09f5678e615988ce35d01ce9980a546568e27ec52065d2578c265275b070ea27
+size 98956573
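The `oid`/`size` pairs in the renamed entries above are Git LFS pointer fields; only the pointers live in the git history, while the blobs sit in LFS storage. As a convenience sketch (not part of the commit), a single large file can be fetched without cloning the whole repo, assuming the `huggingface_hub` client:

```python
# Sketch: fetch one LFS-backed file from this repo; hf_hub_download resolves
# the LFS pointer shown above to the real ~2 GB blob and caches it locally.
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="Guizmus/SD_DreamerCommunities_Collection",
    filename="ckpt/3DStyle-v1.ckpt",
)
print(ckpt_path)  # local cache path of the downloaded checkpoint
```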
feature_extractor/preprocessor_config.json DELETED
@@ -1,20 +0,0 @@
-{
-  "crop_size": 224,
-  "do_center_crop": true,
-  "do_convert_rgb": true,
-  "do_normalize": true,
-  "do_resize": true,
-  "feature_extractor_type": "CLIPFeatureExtractor",
-  "image_mean": [
-    0.48145466,
-    0.4578275,
-    0.40821073
-  ],
-  "image_std": [
-    0.26862954,
-    0.26130258,
-    0.27577711
-  ],
-  "resample": 3,
-  "size": 224
-}
images/showcase_3DChanStyle-v1.jpg ADDED
showcase.jpg → images/showcase_AnimeChan-v1.jpg RENAMED
File without changes
showcase_AnimeChan.jpg → images/showcase_AnimeChan-v2.jpg RENAMED
File without changes
images/showcase_main.jpg ADDED
model_index.json DELETED
@@ -1,32 +0,0 @@
-{
-  "_class_name": "StableDiffusionPipeline",
-  "_diffusers_version": "0.8.0.dev0",
-  "feature_extractor": [
-    "transformers",
-    "CLIPFeatureExtractor"
-  ],
-  "safety_checker": [
-    "stable_diffusion",
-    "StableDiffusionSafetyChecker"
-  ],
-  "scheduler": [
-    "diffusers",
-    "PNDMScheduler"
-  ],
-  "text_encoder": [
-    "transformers",
-    "CLIPTextModel"
-  ],
-  "tokenizer": [
-    "transformers",
-    "CLIPTokenizer"
-  ],
-  "unet": [
-    "diffusers",
-    "UNet2DConditionModel"
-  ],
-  "vae": [
-    "diffusers",
-    "AutoencoderKL"
-  ]
-}
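For context, `model_index.json` is what diffusers reads to assemble a `StableDiffusionPipeline`: each entry maps a component name to a (library, class) pair loaded from the matching subfolder. A brief illustration, pointed at the RunwayML base repo referenced in the README, since this commit removes the diffusers layout from this repo:

```python
# Illustration of how diffusers consumes a model_index.json like the one
# deleted above; uses the base model's repo rather than this one.
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
print(type(pipe.unet).__name__)       # UNet2DConditionModel (from unet/)
print(type(pipe.scheduler).__name__)  # PNDMScheduler (from scheduler/)
```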
safety_checker/config.json DELETED
@@ -1,179 +0,0 @@
-{
-  "_commit_hash": "4bb648a606ef040e7685bde262611766a5fdd67b",
-  "_name_or_path": "CompVis/stable-diffusion-safety-checker",
-  "architectures": [
-    "StableDiffusionSafetyChecker"
-  ],
-  "initializer_factor": 1.0,
-  "logit_scale_init_value": 2.6592,
-  "model_type": "clip",
-  "projection_dim": 768,
-  "text_config": {
-    "_name_or_path": "",
-    "add_cross_attention": false,
-    "architectures": null,
-    "attention_dropout": 0.0,
-    "bad_words_ids": null,
-    "begin_suppress_tokens": null,
-    "bos_token_id": 0,
-    "chunk_size_feed_forward": 0,
-    "cross_attention_hidden_size": null,
-    "decoder_start_token_id": null,
-    "diversity_penalty": 0.0,
-    "do_sample": false,
-    "dropout": 0.0,
-    "early_stopping": false,
-    "encoder_no_repeat_ngram_size": 0,
-    "eos_token_id": 2,
-    "exponential_decay_length_penalty": null,
-    "finetuning_task": null,
-    "forced_bos_token_id": null,
-    "forced_eos_token_id": null,
-    "hidden_act": "quick_gelu",
-    "hidden_size": 768,
-    "id2label": {
-      "0": "LABEL_0",
-      "1": "LABEL_1"
-    },
-    "initializer_factor": 1.0,
-    "initializer_range": 0.02,
-    "intermediate_size": 3072,
-    "is_decoder": false,
-    "is_encoder_decoder": false,
-    "label2id": {
-      "LABEL_0": 0,
-      "LABEL_1": 1
-    },
-    "layer_norm_eps": 1e-05,
-    "length_penalty": 1.0,
-    "max_length": 20,
-    "max_position_embeddings": 77,
-    "min_length": 0,
-    "model_type": "clip_text_model",
-    "no_repeat_ngram_size": 0,
-    "num_attention_heads": 12,
-    "num_beam_groups": 1,
-    "num_beams": 1,
-    "num_hidden_layers": 12,
-    "num_return_sequences": 1,
-    "output_attentions": false,
-    "output_hidden_states": false,
-    "output_scores": false,
-    "pad_token_id": 1,
-    "prefix": null,
-    "problem_type": null,
-    "pruned_heads": {},
-    "remove_invalid_values": false,
-    "repetition_penalty": 1.0,
-    "return_dict": true,
-    "return_dict_in_generate": false,
-    "sep_token_id": null,
-    "suppress_tokens": null,
-    "task_specific_params": null,
-    "temperature": 1.0,
-    "tf_legacy_loss": false,
-    "tie_encoder_decoder": false,
-    "tie_word_embeddings": true,
-    "tokenizer_class": null,
-    "top_k": 50,
-    "top_p": 1.0,
-    "torch_dtype": null,
-    "torchscript": false,
-    "transformers_version": "4.24.0",
-    "typical_p": 1.0,
-    "use_bfloat16": false,
-    "vocab_size": 49408
-  },
-  "text_config_dict": {
-    "hidden_size": 768,
-    "intermediate_size": 3072,
-    "num_attention_heads": 12,
-    "num_hidden_layers": 12
-  },
-  "torch_dtype": "float32",
-  "transformers_version": null,
-  "vision_config": {
-    "_name_or_path": "",
-    "add_cross_attention": false,
-    "architectures": null,
-    "attention_dropout": 0.0,
-    "bad_words_ids": null,
-    "begin_suppress_tokens": null,
-    "bos_token_id": null,
-    "chunk_size_feed_forward": 0,
-    "cross_attention_hidden_size": null,
-    "decoder_start_token_id": null,
-    "diversity_penalty": 0.0,
-    "do_sample": false,
-    "dropout": 0.0,
-    "early_stopping": false,
-    "encoder_no_repeat_ngram_size": 0,
-    "eos_token_id": null,
-    "exponential_decay_length_penalty": null,
-    "finetuning_task": null,
-    "forced_bos_token_id": null,
-    "forced_eos_token_id": null,
-    "hidden_act": "quick_gelu",
-    "hidden_size": 1024,
-    "id2label": {
-      "0": "LABEL_0",
-      "1": "LABEL_1"
-    },
-    "image_size": 224,
-    "initializer_factor": 1.0,
-    "initializer_range": 0.02,
-    "intermediate_size": 4096,
-    "is_decoder": false,
-    "is_encoder_decoder": false,
-    "label2id": {
-      "LABEL_0": 0,
-      "LABEL_1": 1
-    },
-    "layer_norm_eps": 1e-05,
-    "length_penalty": 1.0,
-    "max_length": 20,
-    "min_length": 0,
-    "model_type": "clip_vision_model",
-    "no_repeat_ngram_size": 0,
-    "num_attention_heads": 16,
-    "num_beam_groups": 1,
-    "num_beams": 1,
-    "num_channels": 3,
-    "num_hidden_layers": 24,
-    "num_return_sequences": 1,
-    "output_attentions": false,
-    "output_hidden_states": false,
-    "output_scores": false,
-    "pad_token_id": null,
-    "patch_size": 14,
-    "prefix": null,
-    "problem_type": null,
-    "pruned_heads": {},
-    "remove_invalid_values": false,
-    "repetition_penalty": 1.0,
-    "return_dict": true,
-    "return_dict_in_generate": false,
-    "sep_token_id": null,
-    "suppress_tokens": null,
-    "task_specific_params": null,
-    "temperature": 1.0,
-    "tf_legacy_loss": false,
-    "tie_encoder_decoder": false,
-    "tie_word_embeddings": true,
-    "tokenizer_class": null,
-    "top_k": 50,
-    "top_p": 1.0,
-    "torch_dtype": null,
-    "torchscript": false,
-    "transformers_version": "4.24.0",
-    "typical_p": 1.0,
-    "use_bfloat16": false
-  },
-  "vision_config_dict": {
-    "hidden_size": 1024,
-    "intermediate_size": 4096,
-    "num_attention_heads": 16,
-    "num_hidden_layers": 24,
-    "patch_size": 14
-  }
-}
scheduler/scheduler_config.json DELETED
@@ -1,12 +0,0 @@
-{
-  "_class_name": "PNDMScheduler",
-  "_diffusers_version": "0.8.0.dev0",
-  "beta_end": 0.012,
-  "beta_schedule": "scaled_linear",
-  "beta_start": 0.00085,
-  "num_train_timesteps": 1000,
-  "set_alpha_to_one": false,
-  "skip_prk_steps": true,
-  "steps_offset": 1,
-  "trained_betas": null
-}
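The deleted scheduler config maps one-to-one onto the constructor of diffusers' `PNDMScheduler`; a sketch of the equivalent instantiation:

```python
# Sketch: the scheduler diffusers would have built from the
# scheduler_config.json deleted above (same values, constructor form).
from diffusers import PNDMScheduler

scheduler = PNDMScheduler(
    num_train_timesteps=1000,
    beta_start=0.00085,
    beta_end=0.012,
    beta_schedule="scaled_linear",
    skip_prk_steps=True,
    set_alpha_to_one=False,
    steps_offset=1,
    trained_betas=None,
)
```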
text_encoder/config.json DELETED
@@ -1,25 +0,0 @@
-{
-  "_name_or_path": "openai/clip-vit-large-patch14",
-  "architectures": [
-    "CLIPTextModel"
-  ],
-  "attention_dropout": 0.0,
-  "bos_token_id": 0,
-  "dropout": 0.0,
-  "eos_token_id": 2,
-  "hidden_act": "quick_gelu",
-  "hidden_size": 768,
-  "initializer_factor": 1.0,
-  "initializer_range": 0.02,
-  "intermediate_size": 3072,
-  "layer_norm_eps": 1e-05,
-  "max_position_embeddings": 77,
-  "model_type": "clip_text_model",
-  "num_attention_heads": 12,
-  "num_hidden_layers": 12,
-  "pad_token_id": 1,
-  "projection_dim": 768,
-  "torch_dtype": "float32",
-  "transformers_version": "4.24.0",
-  "vocab_size": 49408
-}
tokenizer/merges.txt DELETED
The diff for this file is too large to render. See raw diff
 
tokenizer/special_tokens_map.json DELETED
@@ -1,24 +0,0 @@
-{
-  "bos_token": {
-    "content": "<|startoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  },
-  "eos_token": {
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  },
-  "pad_token": "<|endoftext|>",
-  "unk_token": {
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  }
-}
tokenizer/tokenizer_config.json DELETED
@@ -1,34 +0,0 @@
-{
-  "add_prefix_space": false,
-  "bos_token": {
-    "__type": "AddedToken",
-    "content": "<|startoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  },
-  "do_lower_case": true,
-  "eos_token": {
-    "__type": "AddedToken",
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  },
-  "errors": "replace",
-  "model_max_length": 77,
-  "name_or_path": "openai/clip-vit-large-patch14",
-  "pad_token": "<|endoftext|>",
-  "special_tokens_map_file": "./special_tokens_map.json",
-  "tokenizer_class": "CLIPTokenizer",
-  "unk_token": {
-    "__type": "AddedToken",
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  }
-}
tokenizer/vocab.json DELETED
The diff for this file is too large to render. See raw diff
 
unet/config.json DELETED
@@ -1,36 +0,0 @@
-{
-  "_class_name": "UNet2DConditionModel",
-  "_diffusers_version": "0.8.0.dev0",
-  "act_fn": "silu",
-  "attention_head_dim": 8,
-  "block_out_channels": [
-    320,
-    640,
-    1280,
-    1280
-  ],
-  "center_input_sample": false,
-  "cross_attention_dim": 768,
-  "down_block_types": [
-    "CrossAttnDownBlock2D",
-    "CrossAttnDownBlock2D",
-    "CrossAttnDownBlock2D",
-    "DownBlock2D"
-  ],
-  "downsample_padding": 1,
-  "flip_sin_to_cos": true,
-  "freq_shift": 0,
-  "in_channels": 4,
-  "layers_per_block": 2,
-  "mid_block_scale_factor": 1,
-  "norm_eps": 1e-05,
-  "norm_num_groups": 32,
-  "out_channels": 4,
-  "sample_size": 32,
-  "up_block_types": [
-    "UpBlock2D",
-    "CrossAttnUpBlock2D",
-    "CrossAttnUpBlock2D",
-    "CrossAttnUpBlock2D"
-  ]
-}
vae/config.json DELETED
@@ -1,29 +0,0 @@
-{
-  "_class_name": "AutoencoderKL",
-  "_diffusers_version": "0.8.0.dev0",
-  "act_fn": "silu",
-  "block_out_channels": [
-    128,
-    256,
-    512,
-    512
-  ],
-  "down_block_types": [
-    "DownEncoderBlock2D",
-    "DownEncoderBlock2D",
-    "DownEncoderBlock2D",
-    "DownEncoderBlock2D"
-  ],
-  "in_channels": 3,
-  "latent_channels": 4,
-  "layers_per_block": 2,
-  "norm_num_groups": 32,
-  "out_channels": 3,
-  "sample_size": 256,
-  "up_block_types": [
-    "UpDecoderBlock2D",
-    "UpDecoderBlock2D",
-    "UpDecoderBlock2D",
-    "UpDecoderBlock2D"
-  ]
-}