metadata
language: zh
license: creativeml-openrail-m
tags:
  - diffusion
  - zh
  - Chinese

Midu-Stable-Diffusion-2-Chinese-Style-v0.1

Brief Introduction

[Example images: cyberpunk shiba ds; waitan gf ssh; cat robot castle]

Probably the first open-source Chinese Stable Diffusion 2 model in the Hugging Face community. The model is fine-tuned from Stable Diffusion v2.1 on about 5 million Chinese-language samples filtered for Chinese style. The data comes from several open-source datasets, including LAION-5B, Noah-Wukong, and Zero, plus some web data.

Model Details

Text encoder

The text encoder is lyua1225/clip-huge-zh-75k-steps-bs4096, used with frozen parameters.

Unet

The UNet was trained for 150k steps on the filtered dataset of 5 million Chinese-style samples. An exponential moving average (EMA) is applied to preserve the original Stable Diffusion 2 drawing capability, striking a balance between Chinese style and that original capability.
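To make the EMA idea concrete, here is a minimal sketch of the standard update rule, not the author's training code. It assumes the EMA copy starts from the pretrained weights and is pulled slowly toward the fine-tuned weights. The decay value, the helper name `ema_update`, and the scalar stand-in weights are all illustrative assumptions; the card does not state any of them.

```python
def ema_update(ema_weights, new_weights, decay=0.999):
    """Return EMA weights moved a small step toward the newly trained weights."""
    return {
        name: decay * ema_weights[name] + (1.0 - decay) * new_weights[name]
        for name in ema_weights
    }


# Toy example with scalar "weights"; a real model holds one tensor per name.
original = {"w": 1.0}    # stands in for the pretrained Stable Diffusion 2.1 weights
ema = dict(original)     # assumption: the EMA copy starts from the pretrained weights
finetuned = {"w": 0.0}   # stands in for the weights after fine-tuning steps

for _ in range(3):
    # decay=0.9 is chosen only so the drift is visible in a few steps
    ema = ema_update(ema, finetuned, decay=0.9)

print(ema["w"])  # has moved only part of the way from 1.0 toward 0.0
```

A decay closer to 1 keeps the averaged weights nearer to the starting point, which is one way to trade off the new style against the original capability.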

Usage

Because the model uses a custom tokenizer, the tokenizer needs to be loaded first.

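# Optional: install accelerate from source.
# !pip install git+https://github.com/huggingface/accelerate
import torch
from diffusers import StableDiffusionPipeline

# Let cuDNN auto-tune convolution kernels for fixed input sizes.
torch.backends.cudnn.benchmark = True

# Load the pipeline in half precision and move it to the GPU.
pipe = StableDiffusionPipeline.from_pretrained("IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-v0.1", torch_dtype=torch.float16)
pipe.to("cuda")

# Prompt: "The torrent plunges three thousand feet, oil painting".
prompt = "飞流直下三千尺,油画"
image = pipe(prompt, guidance_scale=7.5).images[0]
image.save("飞流.png")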
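The snippet above passes only a repository id to the pipeline, so the tokenizer is loaded implicitly. The note about loading the custom tokenizer first could be sketched as follows. This is an assumption-laden illustration: `load_pipeline` is a hypothetical helper, `model_id` is a placeholder for this model's repository id (which the card does not state), the tokenizer is assumed to live in the usual `tokenizer` subfolder, and `trust_remote_code=True` is assumed to be needed because the tokenizer is custom.

```python
def load_pipeline(model_id, device="cuda", trust_remote_code=True):
    """Load the tokenizer explicitly, then hand it to the diffusion pipeline."""
    # Imported lazily so that defining this helper does not require the GPU stack.
    import torch
    from diffusers import StableDiffusionPipeline
    from transformers import AutoTokenizer

    # Assumption: the tokenizer sits in the "tokenizer" subfolder of the repo.
    # trust_remote_code=True runs code shipped with the repo; enable it only
    # for repositories you trust.
    tokenizer = AutoTokenizer.from_pretrained(
        model_id,
        subfolder="tokenizer",
        trust_remote_code=trust_remote_code,
    )

    # Passing tokenizer=... overrides the tokenizer the pipeline would load itself.
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id,
        tokenizer=tokenizer,
        torch_dtype=torch.float16,
    )
    return pipe.to(device)
```

Usage would look like `pipe = load_pipeline("<this model's repository id>")`, followed by the same `pipe(prompt, guidance_scale=7.5)` call as above.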