ziyu600601 committed on
Commit
7cac5a7
1 Parent(s): 3b24a8a
README.md ADDED
@@ -0,0 +1,129 @@
---
inference: true
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
license: creativeml-openrail-m
---

## Please Note!

This model is NOT the 19.2M-image TrinArt Characters model, but an improved version of the original Trin-sama Twitter bot model. It is intended to retain the original SD aesthetics as much as possible while nudging the model toward an anime/manga style.

Other TrinArt models can be found at:

https://huggingface.co/naclbit/trinart_derrida_characters_v2_stable_diffusion

https://huggingface.co/naclbit/trinart_characters_19.2m_stable_diffusion_v1


## Diffusers

The model has been ported to `diffusers` by [ayan4m1](https://huggingface.co/ayan4m1)
and can easily be run from one of the branches:
- `revision="diffusers-60k"` for the checkpoint trained for 60,000 steps,
- `revision="diffusers-95k"` for the checkpoint trained for 95,000 steps,
- `revision="diffusers-115k"` for the checkpoint trained for 115,000 steps.

For more information, please have a look at [the "Three flavors" section](#three-flavors).

## Gradio

We also provide a [Gradio](https://github.com/gradio-app/gradio) web UI built on diffusers that runs in a Colab notebook: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1RWvik_C7nViiR9bNsu3fvMR3STx6RvDx?usp=sharing)


### Example Text2Image

```python
# !pip install diffusers==0.3.0
from diffusers import StableDiffusionPipeline

# using the 60,000 steps checkpoint
pipe = StableDiffusionPipeline.from_pretrained("naclbit/trinart_stable_diffusion_v2", revision="diffusers-60k")
pipe.to("cuda")

image = pipe("A magical dragon flying in front of the Himalaya in manga style").images[0]
image
```

![dragon](https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/a_magical_dragon_himalaya.png)

If you want to run the pipeline faster or on different hardware, please have a look at the [optimization docs](https://huggingface.co/docs/diffusers/optimization/fp16).
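
As a minimal sketch (assuming a newer `diffusers` release than the 0.3.0 pinned above, one that supports `torch_dtype` loading and `enable_attention_slicing`), you can load the weights in half precision and enable attention slicing to reduce memory use:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load the 60k branch in fp16 to roughly halve VRAM usage
pipe = StableDiffusionPipeline.from_pretrained(
    "naclbit/trinart_stable_diffusion_v2",
    revision="diffusers-60k",
    torch_dtype=torch.float16,
)
pipe.to("cuda")
pipe.enable_attention_slicing()  # lower peak memory at a small speed cost

image = pipe("A magical dragon flying in front of the Himalaya in manga style").images[0]
image.save("dragon_fp16.png")  # output filename is arbitrary
```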

### Example Image2Image

```python
# !pip install diffusers==0.3.0
from diffusers import StableDiffusionImg2ImgPipeline
import requests
from PIL import Image
from io import BytesIO

url = "https://scitechdaily.com/images/Dog-Park.jpg"

response = requests.get(url)
init_image = Image.open(BytesIO(response.content)).convert("RGB")
init_image = init_image.resize((768, 512))

# using the 115,000 steps checkpoint
pipe = StableDiffusionImg2ImgPipeline.from_pretrained("naclbit/trinart_stable_diffusion_v2", revision="diffusers-115k")
pipe.to("cuda")

images = pipe(prompt="Manga drawing of Brad Pitt", init_image=init_image, strength=0.75, guidance_scale=7.5).images
images[0]
```

If you want to run the pipeline faster or on different hardware, please have a look at the [optimization docs](https://huggingface.co/docs/diffusers/optimization/fp16).


## Stable Diffusion TrinArt/Trin-sama AI finetune v2

trinart_stable_diffusion is an SD model fine-tuned for about 8 epochs on roughly 40,000 curated, high-resolution anime/manga-style images. This is the same model that runs on the Twitter bot @trinsama (https://twitter.com/trinsama).

## Version 2

The v2 checkpoint adds about 10,000 more images, applies dropout, uses a new tagging strategy, and was trained longer to improve results while retaining the original SD aesthetics.

## Three flavors

The step-115,000 and step-95,000 checkpoints were trained further; if the style nudge is too strong, use the step-60,000 checkpoint instead.

#### img2img

If you want to run **latent-diffusion**'s stock ddim img2img script (in its scripts folder) with this model, **use_ema** must be set to False.
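
A minimal sketch of one way to do this with OmegaConf before instantiating the model (the config path below is an assumption, not a file in this repo):

```python
# Hypothetical sketch: force use_ema off before building the model for the stock ddim img2img script.
from omegaconf import OmegaConf

config = OmegaConf.load("configs/stable-diffusion/v1-inference.yaml")  # assumed config path
config.model.params.use_ema = False  # required when running this checkpoint with the stock img2img script
```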

#### Hardware

- 8x NVIDIA A100 40GB

#### Training Info

- Custom dataset loader with augmentations: XFlip, center crop and aspect-ratio locked scaling (see the sketch below)
- LR: 1.0e-5
- 10% dropouts
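
The sketch below is a hypothetical reconstruction of those augmentations (XFlip, aspect-ratio locked scaling, center crop), not the actual training loader:

```python
import random
from PIL import Image

def augment(img: Image.Image, size: int = 512) -> Image.Image:
    """Hypothetical re-creation of the described augmentations."""
    # XFlip: random horizontal mirror
    if random.random() < 0.5:
        img = img.transpose(Image.FLIP_LEFT_RIGHT)
    # Aspect-ratio locked scaling: resize so the short side equals `size`
    w, h = img.size
    scale = size / min(w, h)
    img = img.resize((max(size, round(w * scale)), max(size, round(h * scale))), Image.LANCZOS)
    # Center crop to a square `size` x `size` patch
    w, h = img.size
    left, top = (w - size) // 2, (h - size) // 2
    return img.crop((left, top, left + size, top + size))
```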

#### Examples

Each image was sampled with K. Crowson's k-LMS method (from the k-diffusion repo) for 50 steps.
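
A rough `diffusers` equivalent is sketched below (assuming `LMSDiscreteScheduler` is the k-LMS counterpart available in your diffusers version; the beta values are taken from this repo's scheduler/scheduler_config.json):

```python
from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler

# Beta schedule matching scheduler/scheduler_config.json in this repo
lms = LMSDiscreteScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear")

pipe = StableDiffusionPipeline.from_pretrained(
    "naclbit/trinart_stable_diffusion_v2", revision="diffusers-115k", scheduler=lms
)
pipe.to("cuda")

image = pipe("A magical dragon flying in front of the Himalaya in manga style",
             num_inference_steps=50).images[0]
```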

![examples](https://pbs.twimg.com/media/FbPO12-VUAAf2CJ?format=jpg&name=900x900)
![examples](https://pbs.twimg.com/media/FbPO65cUIAAga8k?format=jpg&name=900x900)
![examples](https://pbs.twimg.com/media/FbPO_QuVsAAG6xE?format=png&name=900x900)

#### Credits

- Sta, AI Novelist Dev (https://ai-novel.com/) @ Bit192, Inc.
- Stable Diffusion - Rombach, Robin and Blattmann, Andreas and Lorenz, Dominik and Esser, Patrick and Ommer, Bjorn

#### License

CreativeML OpenRAIL-M
model_index.json ADDED
@@ -0,0 +1,24 @@
{
  "_class_name": "StableDiffusionPipeline",
  "_diffusers_version": "0.8.0.dev0",
  "scheduler": [
    "diffusers",
    "PNDMScheduler"
  ],
  "text_encoder": [
    "transformers",
    "CLIPTextModel"
  ],
  "tokenizer": [
    "transformers",
    "CLIPTokenizer"
  ],
  "unet": [
    "diffusers",
    "UNet2DConditionModel"
  ],
  "vae": [
    "diffusers",
    "AutoencoderKL"
  ]
}
scheduler/scheduler_config.json ADDED
@@ -0,0 +1,12 @@
{
  "_class_name": "PNDMScheduler",
  "_diffusers_version": "0.6.0",
  "beta_end": 0.012,
  "beta_schedule": "scaled_linear",
  "beta_start": 0.00085,
  "num_train_timesteps": 1000,
  "set_alpha_to_one": false,
  "skip_prk_steps": true,
  "steps_offset": 1,
  "trained_betas": null
}
text_encoder/config.json ADDED
@@ -0,0 +1,25 @@
{
  "_name_or_path": "openai/clip-vit-large-patch14",
  "architectures": [
    "CLIPTextModel"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 0,
  "dropout": 0.0,
  "eos_token_id": 2,
  "hidden_act": "quick_gelu",
  "hidden_size": 768,
  "initializer_factor": 1.0,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 77,
  "model_type": "clip_text_model",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 1,
  "projection_dim": 768,
  "torch_dtype": "float32",
  "transformers_version": "4.23.1",
  "vocab_size": 49408
}
text_encoder/pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3cc329c12fd921cda5679c859ad7025b1f3e9e492aa3cc3e87994de5287d5bd2
size 492305335
tokenizer/merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer/special_tokens_map.json ADDED
@@ -0,0 +1,24 @@
{
  "bos_token": {
    "content": "<|startoftext|>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": "<|endoftext|>",
  "unk_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer/tokenizer_config.json ADDED
@@ -0,0 +1,34 @@
{
  "add_prefix_space": false,
  "bos_token": {
    "__type": "AddedToken",
    "content": "<|startoftext|>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "do_lower_case": true,
  "eos_token": {
    "__type": "AddedToken",
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  },
  "errors": "replace",
  "model_max_length": 77,
  "name_or_path": "openai/clip-vit-large-patch14",
  "pad_token": "<|endoftext|>",
  "special_tokens_map_file": "./special_tokens_map.json",
  "tokenizer_class": "CLIPTokenizer",
  "unk_token": {
    "__type": "AddedToken",
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": true,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer/vocab.json ADDED
The diff for this file is too large to render. See raw diff
 
trinart2_step115000.ckpt ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:776af18775dfccf29725a994df855e7d8f7b8ea525013e3a466f210ec15c8fd4
size 2132888288
trinart2_step60000.ckpt ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:8cfec49e460fd191e284aa7165e03bf5539b47b78bce37952b7747feb69f2d45
size 780140544
trinart2_step95000.ckpt ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c1799d22a355ba25c9ceeb6e3c91fc61788c8e274b73508ae8a15877c5dbcf63
size 2132888288
unet/config.json ADDED
@@ -0,0 +1,36 @@
{
  "_class_name": "UNet2DConditionModel",
  "_diffusers_version": "0.6.0",
  "act_fn": "silu",
  "attention_head_dim": 8,
  "block_out_channels": [
    320,
    640,
    1280,
    1280
  ],
  "center_input_sample": false,
  "cross_attention_dim": 768,
  "down_block_types": [
    "CrossAttnDownBlock2D",
    "CrossAttnDownBlock2D",
    "CrossAttnDownBlock2D",
    "DownBlock2D"
  ],
  "downsample_padding": 1,
  "flip_sin_to_cos": true,
  "freq_shift": 0,
  "in_channels": 4,
  "layers_per_block": 2,
  "mid_block_scale_factor": 1,
  "norm_eps": 1e-05,
  "norm_num_groups": 32,
  "out_channels": 4,
  "sample_size": 32,
  "up_block_types": [
    "UpBlock2D",
    "CrossAttnUpBlock2D",
    "CrossAttnUpBlock2D",
    "CrossAttnUpBlock2D"
  ]
}
unet/diffusion_pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:aaa80770bdbdc88da6dc6bbfb82120686a36796d6b73ee4d32f5622c2d6682d9
size 3438354725
vae/config.json ADDED
@@ -0,0 +1,29 @@
{
  "_class_name": "AutoencoderKL",
  "_diffusers_version": "0.6.0",
  "act_fn": "silu",
  "block_out_channels": [
    128,
    256,
    512,
    512
  ],
  "down_block_types": [
    "DownEncoderBlock2D",
    "DownEncoderBlock2D",
    "DownEncoderBlock2D",
    "DownEncoderBlock2D"
  ],
  "in_channels": 3,
  "latent_channels": 4,
  "layers_per_block": 2,
  "norm_num_groups": 32,
  "out_channels": 3,
  "sample_size": 256,
  "up_block_types": [
    "UpDecoderBlock2D",
    "UpDecoderBlock2D",
    "UpDecoderBlock2D",
    "UpDecoderBlock2D"
  ]
}
vae/diffusion_pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b8cf5b49d164db18a485d392b2d9a9b4e3636d70613cb756d2e1bc460dd13161
size 334707217