AdamOswald1 committed on
Commit
0148e29
1 Parent(s): 7f31634

Upload 16 files

.gitattributes CHANGED
@@ -32,3 +32,13 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ example-1.png filter=lfs diff=lfs merge=lfs -text
+ example-2.png filter=lfs diff=lfs merge=lfs -text
+ example-3.png filter=lfs diff=lfs merge=lfs -text
+ example-4.png filter=lfs diff=lfs merge=lfs -text
+ samples/1girl.png filter=lfs diff=lfs merge=lfs -text
+ samples/scenery.png filter=lfs diff=lfs merge=lfs -text
+ samples/1boy.png filter=lfs diff=lfs merge=lfs -text
+ 1girl.png filter=lfs diff=lfs merge=lfs -text
+ 1boy.png filter=lfs diff=lfs merge=lfs -text
+ scenery.png filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,254 @@
  ---
+ language:
+ - en
  license: creativeml-openrail-m
+ pipeline_tag: text-to-image
+ tags:
+ - stable-diffusion
+ - stable-diffusion-diffusers
+ - text-to-image
+ - diffusers
+ inference: true
+ widget:
+ - text: >-
+     masterpiece, best quality, 1girl, brown hair, green eyes, colorful,
+     autumn, cumulonimbus clouds, lighting, blue sky, falling leaves, garden
+   example_title: example 1girl
+ - text: >-
+     masterpiece, best quality, 1boy, medium hair, blonde hair, blue eyes, bishounen, colorful,
+     autumn, cumulonimbus clouds, lighting, blue sky, falling leaves, garden
+   example_title: example 1boy
  ---
+
+ <font color="grey">Thanks to [Linaqruf](https://huggingface.co/Linaqruf) for letting me borrow his model card for reference.</font>
+
+ # Anything V4
+
+ Welcome to Anything V4, a latent diffusion model for weebs and the newest version of Anything. This model is intended to produce high-quality, highly detailed anime-style images with just a few prompts. Like other anime-style Stable Diffusion models, it also supports danbooru tags for image generation.
+
+ e.g. **_1girl, white hair, golden eyes, beautiful eyes, detail, flower meadow, cumulonimbus clouds, lighting, detailed sky, garden_**
+
+ I think the V4.5 version is better, though; it is also in this repo, so feel free to try it.
+
+ ## Gradio
+
+ We support a [Gradio](https://github.com/gradio-app/gradio) Web UI to run anything-v4.0:
+ [![Open In Spaces](https://camo.githubusercontent.com/00380c35e60d6b04be65d3d94a58332be5cc93779f630bcdfc18ab9a3a7d3388/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f25463025394625413425393725323048756767696e67253230466163652d5370616365732d626c7565)](https://huggingface.co/spaces/akhaliq/anything-v4.0)
+
+ ## 🧨 Diffusers
+
+ This model can be used just like any other Stable Diffusion model. For more information,
+ please have a look at the [Stable Diffusion documentation](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion).
+
+ You can also export the model to [ONNX](https://huggingface.co/docs/diffusers/optimization/onnx), [MPS](https://huggingface.co/docs/diffusers/optimization/mps) and/or [FLAX/JAX](); a short MPS sketch follows the example below.
+
+ ```python
+ from diffusers import StableDiffusionPipeline
+ import torch
+
+ model_id = "andite/anything-v4.0"
+ pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
+ pipe = pipe.to("cuda")
+
+ prompt = "hatsune_miku"
+ image = pipe(prompt).images[0]
+
+ image.save("./hatsune_miku.png")
+ ```
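+
+ As mentioned above, the checkpoint can also run on Apple Silicon through PyTorch's MPS backend. This is only a minimal sketch, assuming an MPS-enabled PyTorch build; float32 is used here because half precision can be unreliable on MPS:
+
+ ```python
+ from diffusers import StableDiffusionPipeline
+ import torch
+
+ # Same model id as above; float32 is the safer default on the MPS backend.
+ pipe = StableDiffusionPipeline.from_pretrained("andite/anything-v4.0", torch_dtype=torch.float32)
+ pipe = pipe.to("mps")  # move the whole pipeline to the Metal (MPS) device
+ pipe.enable_attention_slicing()  # helps on Macs with limited memory
+
+ # A short warm-up pass is commonly recommended before the first real generation on MPS.
+ _ = pipe("warmup", num_inference_steps=1)
+
+ image = pipe("masterpiece, best quality, 1girl, white hair, golden eyes").images[0]
+ image.save("./mps_example.png")
+ ```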
+
+ ## Examples
+
+ Below are some examples of images generated using this model:
+
+ **Anime Girl:**
+ ![Anime Girl](https://huggingface.co/andite/anything-v4.0/resolve/main/example-1.png)
+ ```
+ masterpiece, best quality, 1girl, white hair, medium hair, cat ears, closed eyes, looking at viewer, :3, cute, scarf, jacket, outdoors, streets
+ Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7
+ ```
+ **Anime Boy:**
+ ![Anime Boy](https://huggingface.co/andite/anything-v4.0/resolve/main/example-2.png)
+ ```
+ 1boy, bishounen, casual, indoors, sitting, coffee shop, bokeh
+ Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7
+ ```
+ **Scenery:**
+ ![Scenery](https://huggingface.co/andite/anything-v4.0/resolve/main/example-4.png)
+ ```
+ scenery, village, outdoors, sky, clouds
+ Steps: 50, Sampler: DPM++ 2S a Karras, CFG scale: 7
+ ```
+
+ ## License
+
+ This model is open access and available to all, with a CreativeML OpenRAIL-M license further specifying rights and usage.
+ The CreativeML OpenRAIL License specifies:
+
+ 1. You can't use the model to deliberately produce or share illegal or harmful outputs or content.
+ 2. The authors claim no rights on the outputs you generate; you are free to use them, and you are accountable for their use, which must not go against the provisions set in the license.
+ 3. You may re-distribute the weights and use the model commercially and/or as a service. If you do, please be aware that you have to include the same use restrictions as the ones in the license and share a copy of the CreativeML OpenRAIL-M with all your users (please read the license entirely and carefully).
+
+ [Please read the full license here](https://huggingface.co/spaces/CompVis/stable-diffusion-license)
+
+ ## Big Thanks to
+
+ - [Linaqruf](https://huggingface.co/Linaqruf), [NoCrypt](https://huggingface.co/NoCrypt), and Fannovel16#9022 for helping me out a lot with my inquiries and concerns about models and other things.
+
+ # Anything V3 - Better VAE
+
+ Welcome to Anything V3 - Better VAE. It currently comes in three model formats: diffusers, ckpt, and safetensors. You'll never see a grey image result again. This model is designed to produce high-quality, highly detailed anime-style images with just a few prompts. Like other anime-style Stable Diffusion models, it also supports danbooru tags for image generation.
+
+ e.g. **_1girl, white hair, golden eyes, beautiful eyes, detail, flower meadow, cumulonimbus clouds, lighting, detailed sky, garden_**
+
+ ## Gradio
+
+ We support a [Gradio](https://github.com/gradio-app/gradio) Web UI to run Anything V3 with Better VAE:
+
+ [![Open In Spaces](https://camo.githubusercontent.com/00380c35e60d6b04be65d3d94a58332be5cc93779f630bcdfc18ab9a3a7d3388/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f25463025394625413425393725323048756767696e67253230466163652d5370616365732d626c7565)](https://huggingface.co/spaces/Linaqruf/Linaqruf-anything-v3-better-vae)
+
+ ## 🧨 Diffusers
+
+ This model can be used just like any other Stable Diffusion model. For more information,
+ please have a look at the [Stable Diffusion documentation](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion). You can also export the model to [ONNX](https://huggingface.co/docs/diffusers/optimization/onnx), [MPS](https://huggingface.co/docs/diffusers/optimization/mps) and/or [FLAX/JAX]().
+
+ You should install the dependencies below in order to run the pipeline:
+
+ ```bash
+ pip install diffusers transformers accelerate scipy safetensors
+ ```
+ Running the pipeline (if you don't swap the scheduler, it will run with the default DDIM; in this example we swap it to DPMSolverMultistepScheduler):
+
+ ```python
+ import torch
+ from torch import autocast
+ from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler
+
+ model_id = "Linaqruf/anything-v3-0-better-vae"
+
+ # Use the DPMSolverMultistepScheduler (DPM-Solver++) scheduler here instead
+ pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
+ pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
+ pipe = pipe.to("cuda")
+
+ prompt = "masterpiece, best quality, illustration, beautiful detailed, finely detailed, dramatic light, intricate details, 1girl, brown hair, green eyes, colorful, autumn, cumulonimbus clouds, lighting, blue sky, falling leaves, garden"
+ negative_prompt = "lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, artist name"
+
+ with autocast("cuda"):
+     image = pipe(prompt,
+                  negative_prompt=negative_prompt,
+                  width=512,
+                  height=640,
+                  guidance_scale=12,
+                  num_inference_steps=50).images[0]
+
+ image.save("anime_girl.png")
+ ```
+
+ ## Examples
+
+ Below are some examples of images generated using this model:
+
+ **Anime Girl:**
+ ![Anime Girl](https://huggingface.co/Linaqruf/anything-v3-better-vae/resolve/main/samples/1girl.png)
+
+ **Anime Boy:**
+ ![Anime Boy](https://huggingface.co/Linaqruf/anything-v3-better-vae/resolve/main/samples/1boy.png)
+
+ **Scenery:**
+ ![Scenery](https://huggingface.co/Linaqruf/anything-v3-better-vae/resolve/main/samples/scenery.png)
+
+ ## License
+
+ This model is open access and available to all, with a CreativeML OpenRAIL-M license further specifying rights and usage.
+ The CreativeML OpenRAIL License specifies:
+
+ 1. You can't use the model to deliberately produce or share illegal or harmful outputs or content.
+ 2. The authors claim no rights on the outputs you generate; you are free to use them, and you are accountable for their use, which must not go against the provisions set in the license.
+ 3. You may re-distribute the weights and use the model commercially and/or as a service. If you do, please be aware that you have to include the same use restrictions as the ones in the license and share a copy of the CreativeML OpenRAIL-M with all your users (please read the license entirely and carefully).
+
+ [Please read the full license here](https://huggingface.co/spaces/CompVis/stable-diffusion-license)
+
+ # Announcement
+
+ For the (unofficial) continuation of this model, please visit [andite/anything-v4.0](https://huggingface.co/andite/anything-v4.0). I am aware that the repo exists because I am literally the one who (accidentally) gave the idea to publish his fine-tuned model ([andite/yohan-diffusion](https://huggingface.co/andite/yohan-diffusion)) as a base, merge it with many mysterious models, and say "hey, let's call it 'Anything V4.0'", because the quality is quite similar to Anything V3 but upgraded.
+
+ I also wanted to tell you something. I have been planning to remove or make private one of the repos named "Anything V3":
+ - [Linaqruf/anything-v3.0](https://huggingface.co/Linaqruf/anything-v3.0/)
+ - [Linaqruf/anything-v3-better-vae](https://huggingface.co/Linaqruf/anything-v3-better-vae)
+
+ There are two versions now, and I was late to realize that this mysterious, nonsensical model has been polluting Hugging Face Trending for so long; now that the new repo is out, it is there as well. I feel guilty every time this model is on the trending leaderboard.
+
+ I would prefer to delete or make private this one and let us slowly move to [Linaqruf/anything-v3-better-vae](https://huggingface.co/Linaqruf/anything-v3-better-vae), which has better repo management and a better VAE included in the model.
+
+ Please share your thoughts in discussion #133 about whether I should delete this repo, the other one, or maybe both of them.
+
+ Thanks,
+ Linaqruf.
+
+ ---
+
+ # Anything V3
+
+ Welcome to Anything V3, a latent diffusion model for weebs. This model is intended to produce high-quality, highly detailed anime-style images with just a few prompts. Like other anime-style Stable Diffusion models, it also supports danbooru tags for image generation.
+
+ e.g. **_1girl, white hair, golden eyes, beautiful eyes, detail, flower meadow, cumulonimbus clouds, lighting, detailed sky, garden_**
+
+ ## Gradio
+
+ We support a [Gradio](https://github.com/gradio-app/gradio) Web UI to run Anything-V3.0:
+
+ [Open in Spaces](https://huggingface.co/spaces/akhaliq/anything-v3.0)
+
+ ## 🧨 Diffusers
+
+ This model can be used just like any other Stable Diffusion model. For more information,
+ please have a look at the [Stable Diffusion documentation](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion).
+
+ You can also export the model to [ONNX](https://huggingface.co/docs/diffusers/optimization/onnx), [MPS](https://huggingface.co/docs/diffusers/optimization/mps) and/or [FLAX/JAX]().
+
+ ```python
+ from diffusers import StableDiffusionPipeline
+ import torch
+
+ model_id = "Linaqruf/anything-v3.0"
+ branch_name = "diffusers"
+
+ pipe = StableDiffusionPipeline.from_pretrained(model_id, revision=branch_name, torch_dtype=torch.float16)
+ pipe = pipe.to("cuda")
+
+ prompt = "pikachu"
+ image = pipe(prompt).images[0]
+
+ image.save("./pikachu.png")
+ ```
+
+ ## Examples
+
+ Below are some examples of images generated using this model:
+
+ **Anime Girl:**
+ ![Anime Girl](https://huggingface.co/Linaqruf/anything-v3.0/resolve/main/1girl.png)
+ ```
+ 1girl, brown hair, green eyes, colorful, autumn, cumulonimbus clouds, lighting, blue sky, falling leaves, garden
+ Steps: 50, Sampler: DDIM, CFG scale: 12
+ ```
+ **Anime Boy:**
+ ![Anime Boy](https://huggingface.co/Linaqruf/anything-v3.0/resolve/main/1boy.png)
+ ```
+ 1boy, medium hair, blonde hair, blue eyes, bishounen, colorful, autumn, cumulonimbus clouds, lighting, blue sky, falling leaves, garden
+ Steps: 50, Sampler: DDIM, CFG scale: 12
+ ```
+ **Scenery:**
+ ![Scenery](https://huggingface.co/Linaqruf/anything-v3.0/resolve/main/scenery.png)
+ ```
+ scenery, shibuya tokyo, post-apocalypse, ruins, rust, sky, skyscraper, abandoned, blue sky, broken window, building, cloud, crane machine, outdoors, overgrown, pillar, sunset
+ Steps: 50, Sampler: DDIM, CFG scale: 12
+ ```
+
+ ## License
+
+ This model is open access and available to all, with a CreativeML OpenRAIL-M license further specifying rights and usage.
+ The CreativeML OpenRAIL License specifies:
+
+ 1. You can't use the model to deliberately produce or share illegal or harmful outputs or content.
+ 2. The authors claim no rights on the outputs you generate; you are free to use them, and you are accountable for their use, which must not go against the provisions set in the license.
+ 3. You may re-distribute the weights and use the model commercially and/or as a service. If you do, please be aware that you have to include the same use restrictions as the ones in the license and share a copy of the CreativeML OpenRAIL-M with all your users (please read the license entirely and carefully).
+
+ [Please read the full license here](https://huggingface.co/spaces/CompVis/stable-diffusion-license)
feature_extractor/preprocessor_config.json ADDED
@@ -0,0 +1,28 @@
+ {
+   "crop_size": {
+     "height": 224,
+     "width": 224
+   },
+   "do_center_crop": true,
+   "do_convert_rgb": true,
+   "do_normalize": true,
+   "do_rescale": true,
+   "do_resize": true,
+   "feature_extractor_type": "CLIPFeatureExtractor",
+   "image_mean": [
+     0.48145466,
+     0.4578275,
+     0.40821073
+   ],
+   "image_processor_type": "CLIPFeatureExtractor",
+   "image_std": [
+     0.26862954,
+     0.26130258,
+     0.27577711
+   ],
+   "resample": 3,
+   "rescale_factor": 0.00392156862745098,
+   "size": {
+     "shortest_edge": 224
+   }
+ }
model_index.json ADDED
@@ -0,0 +1,33 @@
+ {
+   "_class_name": "StableDiffusionPipeline",
+   "_diffusers_version": "0.12.0.dev0",
+   "feature_extractor": [
+     "transformers",
+     "CLIPFeatureExtractor"
+   ],
+   "requires_safety_checker": true,
+   "safety_checker": [
+     "stable_diffusion",
+     "StableDiffusionSafetyChecker"
+   ],
+   "scheduler": [
+     "diffusers",
+     "PNDMScheduler"
+   ],
+   "text_encoder": [
+     "transformers",
+     "CLIPTextModel"
+   ],
+   "tokenizer": [
+     "transformers",
+     "CLIPTokenizer"
+   ],
+   "unet": [
+     "diffusers",
+     "UNet2DConditionModel"
+   ],
+   "vae": [
+     "diffusers",
+     "AutoencoderKL"
+   ]
+ }
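
model_index.json is what tells `StableDiffusionPipeline.from_pretrained` which library and class to instantiate for each subfolder of the repo. A minimal sketch of loading a single component on its own; the repo id `AdamOswald1/anything-v4.0` is a hypothetical placeholder for this repository, not a confirmed name:

```python
from diffusers import AutoencoderKL, UNet2DConditionModel

# Hypothetical repo id -- substitute the actual id of this repository.
repo_id = "AdamOswald1/anything-v4.0"

# Each entry in model_index.json maps a subfolder to (library, class name),
# so a single component can be pulled out of the repo by its subfolder.
unet = UNet2DConditionModel.from_pretrained(repo_id, subfolder="unet")
vae = AutoencoderKL.from_pretrained(repo_id, subfolder="vae")

# sample_size is 64 for the UNet and 512 for the VAE, per the configs in this commit.
print(unet.config.sample_size, vae.config.sample_size)
```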
safety_checker/config.json ADDED
@@ -0,0 +1,181 @@
+ {
+   "_commit_hash": "cb41f3a270d63d454d385fc2e4f571c487c253c5",
+   "_name_or_path": "CompVis/stable-diffusion-safety-checker",
+   "architectures": [
+     "StableDiffusionSafetyChecker"
+   ],
+   "initializer_factor": 1.0,
+   "logit_scale_init_value": 2.6592,
+   "model_type": "clip",
+   "projection_dim": 768,
+   "text_config": {
+     "_name_or_path": "",
+     "add_cross_attention": false,
+     "architectures": null,
+     "attention_dropout": 0.0,
+     "bad_words_ids": null,
+     "begin_suppress_tokens": null,
+     "bos_token_id": 0,
+     "chunk_size_feed_forward": 0,
+     "cross_attention_hidden_size": null,
+     "decoder_start_token_id": null,
+     "diversity_penalty": 0.0,
+     "do_sample": false,
+     "dropout": 0.0,
+     "early_stopping": false,
+     "encoder_no_repeat_ngram_size": 0,
+     "eos_token_id": 2,
+     "exponential_decay_length_penalty": null,
+     "finetuning_task": null,
+     "forced_bos_token_id": null,
+     "forced_eos_token_id": null,
+     "hidden_act": "quick_gelu",
+     "hidden_size": 768,
+     "id2label": {
+       "0": "LABEL_0",
+       "1": "LABEL_1"
+     },
+     "initializer_factor": 1.0,
+     "initializer_range": 0.02,
+     "intermediate_size": 3072,
+     "is_decoder": false,
+     "is_encoder_decoder": false,
+     "label2id": {
+       "LABEL_0": 0,
+       "LABEL_1": 1
+     },
+     "layer_norm_eps": 1e-05,
+     "length_penalty": 1.0,
+     "max_length": 20,
+     "max_position_embeddings": 77,
+     "min_length": 0,
+     "model_type": "clip_text_model",
+     "no_repeat_ngram_size": 0,
+     "num_attention_heads": 12,
+     "num_beam_groups": 1,
+     "num_beams": 1,
+     "num_hidden_layers": 12,
+     "num_return_sequences": 1,
+     "output_attentions": false,
+     "output_hidden_states": false,
+     "output_scores": false,
+     "pad_token_id": 1,
+     "prefix": null,
+     "problem_type": null,
+     "projection_dim": 512,
+     "pruned_heads": {},
+     "remove_invalid_values": false,
+     "repetition_penalty": 1.0,
+     "return_dict": true,
+     "return_dict_in_generate": false,
+     "sep_token_id": null,
+     "suppress_tokens": null,
+     "task_specific_params": null,
+     "temperature": 1.0,
+     "tf_legacy_loss": false,
+     "tie_encoder_decoder": false,
+     "tie_word_embeddings": true,
+     "tokenizer_class": null,
+     "top_k": 50,
+     "top_p": 1.0,
+     "torch_dtype": null,
+     "torchscript": false,
+     "transformers_version": "4.26.0.dev0",
+     "typical_p": 1.0,
+     "use_bfloat16": false,
+     "vocab_size": 49408
+   },
+   "text_config_dict": {
+     "hidden_size": 768,
+     "intermediate_size": 3072,
+     "num_attention_heads": 12,
+     "num_hidden_layers": 12
+   },
+   "torch_dtype": "float32",
+   "transformers_version": null,
+   "vision_config": {
+     "_name_or_path": "",
+     "add_cross_attention": false,
+     "architectures": null,
+     "attention_dropout": 0.0,
+     "bad_words_ids": null,
+     "begin_suppress_tokens": null,
+     "bos_token_id": null,
+     "chunk_size_feed_forward": 0,
+     "cross_attention_hidden_size": null,
+     "decoder_start_token_id": null,
+     "diversity_penalty": 0.0,
+     "do_sample": false,
+     "dropout": 0.0,
+     "early_stopping": false,
+     "encoder_no_repeat_ngram_size": 0,
+     "eos_token_id": null,
+     "exponential_decay_length_penalty": null,
+     "finetuning_task": null,
+     "forced_bos_token_id": null,
+     "forced_eos_token_id": null,
+     "hidden_act": "quick_gelu",
+     "hidden_size": 1024,
+     "id2label": {
+       "0": "LABEL_0",
+       "1": "LABEL_1"
+     },
+     "image_size": 224,
+     "initializer_factor": 1.0,
+     "initializer_range": 0.02,
+     "intermediate_size": 4096,
+     "is_decoder": false,
+     "is_encoder_decoder": false,
+     "label2id": {
+       "LABEL_0": 0,
+       "LABEL_1": 1
+     },
+     "layer_norm_eps": 1e-05,
+     "length_penalty": 1.0,
+     "max_length": 20,
+     "min_length": 0,
+     "model_type": "clip_vision_model",
+     "no_repeat_ngram_size": 0,
+     "num_attention_heads": 16,
+     "num_beam_groups": 1,
+     "num_beams": 1,
+     "num_channels": 3,
+     "num_hidden_layers": 24,
+     "num_return_sequences": 1,
+     "output_attentions": false,
+     "output_hidden_states": false,
+     "output_scores": false,
+     "pad_token_id": null,
+     "patch_size": 14,
+     "prefix": null,
+     "problem_type": null,
+     "projection_dim": 512,
+     "pruned_heads": {},
+     "remove_invalid_values": false,
+     "repetition_penalty": 1.0,
+     "return_dict": true,
+     "return_dict_in_generate": false,
+     "sep_token_id": null,
+     "suppress_tokens": null,
+     "task_specific_params": null,
+     "temperature": 1.0,
+     "tf_legacy_loss": false,
+     "tie_encoder_decoder": false,
+     "tie_word_embeddings": true,
+     "tokenizer_class": null,
+     "top_k": 50,
+     "top_p": 1.0,
+     "torch_dtype": null,
+     "torchscript": false,
+     "transformers_version": "4.26.0.dev0",
+     "typical_p": 1.0,
+     "use_bfloat16": false
+   },
+   "vision_config_dict": {
+     "hidden_size": 1024,
+     "intermediate_size": 4096,
+     "num_attention_heads": 16,
+     "num_hidden_layers": 24,
+     "patch_size": 14
+   }
+ }
samples/1boy.png ADDED

Git LFS Details

  • SHA256: 977feb68eb35518fa11a4977d6f5658c724536da266da6ca2c998ed9bb5544d7
  • Pointer size: 132 Bytes
  • Size of remote file: 2.38 MB
samples/1girl.png ADDED

Git LFS Details

  • SHA256: ed516d2bde40f457bddbe9143d6c5175ed8e8954b3a254269835dd359f5322a1
  • Pointer size: 132 Bytes
  • Size of remote file: 2.53 MB
samples/scenery.png ADDED

Git LFS Details

  • SHA256: 02513474ee2b635407c1fa9192890d9049da9b6cb76a56cf765ff4cd5fa45d45
  • Pointer size: 132 Bytes
  • Size of remote file: 1.29 MB
scheduler/scheduler_config.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "_class_name": "PNDMScheduler",
+   "_diffusers_version": "0.12.0.dev0",
+   "beta_end": 0.012,
+   "beta_schedule": "scaled_linear",
+   "beta_start": 0.00085,
+   "clip_sample": false,
+   "num_train_timesteps": 1000,
+   "prediction_type": "epsilon",
+   "set_alpha_to_one": false,
+   "skip_prk_steps": true,
+   "steps_offset": 1,
+   "trained_betas": null
+ }
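
The pipeline loads the PNDM scheduler configured above by default. A minimal sketch of reading this config directly, or reusing it for a different scheduler, again with the hypothetical repo id `AdamOswald1/anything-v4.0` standing in for this repository:

```python
from diffusers import EulerAncestralDiscreteScheduler, PNDMScheduler, StableDiffusionPipeline

# Hypothetical repo id -- substitute the actual id of this repository.
repo_id = "AdamOswald1/anything-v4.0"

# Load the scheduler exactly as configured in scheduler/scheduler_config.json ...
scheduler = PNDMScheduler.from_pretrained(repo_id, subfolder="scheduler")

# ... or reuse the same config (betas, timesteps, prediction type) for another scheduler.
pipe = StableDiffusionPipeline.from_pretrained(repo_id)
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
```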
text_encoder/config.json ADDED
@@ -0,0 +1,25 @@
+ {
+   "_name_or_path": "openai/clip-vit-large-patch14",
+   "architectures": [
+     "CLIPTextModel"
+   ],
+   "attention_dropout": 0.0,
+   "bos_token_id": 0,
+   "dropout": 0.0,
+   "eos_token_id": 2,
+   "hidden_act": "quick_gelu",
+   "hidden_size": 768,
+   "initializer_factor": 1.0,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 77,
+   "model_type": "clip_text_model",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "projection_dim": 768,
+   "torch_dtype": "float32",
+   "transformers_version": "4.26.0.dev0",
+   "vocab_size": 49408
+ }
tokenizer/merges.txt ADDED
The diff for this file is too large to render. See raw diff
tokenizer/special_tokens_map.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "bos_token": {
+     "content": "<|startoftext|>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "<|endoftext|>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": "<|endoftext|>",
+   "unk_token": {
+     "content": "<|endoftext|>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer/tokenizer_config.json ADDED
@@ -0,0 +1,48 @@
+ {
+   "add_prefix_space": false,
+   "bos_token": {
+     "__type": "AddedToken",
+     "content": "<|startoftext|>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "do_lower_case": true,
+   "eos_token": {
+     "__type": "AddedToken",
+     "content": "<|endoftext|>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "errors": "replace",
+   "model_max_length": 77,
+   "name_or_path": "runwayml/stable-diffusion-v1-5",
+   "pad_token": "<|endoftext|>",
+   "special_tokens_map_file": "./special_tokens_map.json",
+   "tokenizer_class": "CLIPTokenizer",
+   "unk_token": {
+     "__type": "AddedToken",
+     "content": "<|endoftext|>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "errors": "replace",
+   "model_max_length": 77,
+   "name_or_path": "openai/clip-vit-large-patch14",
+   "pad_token": "<|endoftext|>",
+   "special_tokens_map_file": "./special_tokens_map.json",
+   "tokenizer_class": "CLIPTokenizer",
+   "unk_token": {
+     "__type": "AddedToken",
+     "content": "<|endoftext|>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer/vocab.json ADDED
The diff for this file is too large to render. See raw diff
unet/config.json ADDED
@@ -0,0 +1,44 @@
+ {
+   "_class_name": "UNet2DConditionModel",
+   "_diffusers_version": "0.12.0.dev0",
+   "act_fn": "silu",
+   "attention_head_dim": 8,
+   "block_out_channels": [
+     320,
+     640,
+     1280,
+     1280
+   ],
+   "center_input_sample": false,
+   "class_embed_type": null,
+   "cross_attention_dim": 768,
+   "down_block_types": [
+     "CrossAttnDownBlock2D",
+     "CrossAttnDownBlock2D",
+     "CrossAttnDownBlock2D",
+     "DownBlock2D"
+   ],
+   "downsample_padding": 1,
+   "dual_cross_attention": false,
+   "flip_sin_to_cos": true,
+   "freq_shift": 0,
+   "in_channels": 4,
+   "layers_per_block": 2,
+   "mid_block_scale_factor": 1,
+   "mid_block_type": "UNetMidBlock2DCrossAttn",
+   "norm_eps": 1e-05,
+   "norm_num_groups": 32,
+   "num_class_embeds": null,
+   "only_cross_attention": false,
+   "out_channels": 4,
+   "resnet_time_scale_shift": "default",
+   "sample_size": 64,
+   "up_block_types": [
+     "UpBlock2D",
+     "CrossAttnUpBlock2D",
+     "CrossAttnUpBlock2D",
+     "CrossAttnUpBlock2D"
+   ],
+   "upcast_attention": false,
+   "use_linear_projection": false
+ }
vae/config.json ADDED
@@ -0,0 +1,29 @@
+ {
+   "_class_name": "AutoencoderKL",
+   "_diffusers_version": "0.12.0.dev0",
+   "act_fn": "silu",
+   "block_out_channels": [
+     128,
+     256,
+     512,
+     512
+   ],
+   "down_block_types": [
+     "DownEncoderBlock2D",
+     "DownEncoderBlock2D",
+     "DownEncoderBlock2D",
+     "DownEncoderBlock2D"
+   ],
+   "in_channels": 3,
+   "latent_channels": 4,
+   "layers_per_block": 2,
+   "norm_num_groups": 32,
+   "out_channels": 3,
+   "sample_size": 512,
+   "up_block_types": [
+     "UpDecoderBlock2D",
+     "UpDecoderBlock2D",
+     "UpDecoderBlock2D",
+     "UpDecoderBlock2D"
+   ]
+ }