Stable Diffusion 1.5 Latent Consistency Model for RKNN2

(English README see below)

Run the Stable Diffusion 1.5 LCM image generation model using RKNPU2!

  • Inference speed (RK3588, single NPU core):

    • 384x384: Text encoder 0.05s + U-Net 2.36s/it + VAE Decoder 5.48s
    • 512x512: Text encoder 0.05s + U-Net 5.65s/it + VAE Decoder 11.13s
  • Memory usage:

    • 384x384: About 5.2GB
    • 512x512: About 5.6GB

Usage

1. Clone or download this repository to your local machine.

2. Install dependencies

pip install diffusers pillow "numpy<2" rknn-toolkit-lite2

3. Run

python ./run_rknn-lcm.py -i ./model -o ./images --num-inference-steps 4 -s 512x512 --prompt "Majestic mountain landscape with snow-capped peaks, autumn foliage in vibrant reds and oranges, a turquoise river winding through a valley, crisp and serene atmosphere, ultra-realistic style."


Model Conversion

Install dependencies

pip install diffusers pillow "numpy<2" rknn-toolkit2

1. Download the model

Download a Stable Diffusion 1.5 LCM model in ONNX format and place it in the ./model directory.

huggingface-cli download TheyCallMeHex/LCM-Dreamshaper-V7-ONNX
cp -r -L ~/.cache/huggingface/hub/models--TheyCallMeHex--LCM-Dreamshaper-V7-ONNX/snapshots/4029a217f9cdc0437f395738d3ab686bb910ceea ./model

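In theory, you could also run LCM inference by merging the LCM LoRA into a regular Stable Diffusion 1.5 model and then converting the result to ONNX format. However, I don't know how to do this myself. If you do, feel free to submit a PR.

2. Convert the model

# Convert the model at 384x384 resolution
python ./convert-onnx-to-rknn.py -m ./model -r 384x384

Note that the higher the resolution, the larger the model and the longer the conversion time. Very high resolutions are not recommended.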

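Known Issues

  1. Models converted with rknn-toolkit2 2.2.0 suffer from severe precision loss, even with the fp16 data type. In the comparison image, the top is the result of inference with the ONNX model and the bottom is the result with the RKNN model, with all parameters identical. The higher the resolution, the more severe the precision loss. This was a bug in rknn-toolkit2 and is fixed in v2.3.0.

  2. The model conversion script accepts multiple resolutions (e.g., "384x384,256x256"), but doing so causes the conversion to fail. This is a bug in rknn-toolkit2.

References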

English README

Stable Diffusion 1.5 Latent Consistency Model for RKNN2

Run the Stable Diffusion 1.5 LCM image generation model using RKNPU2!

  • Inference speed (RK3588, single NPU core):
    • 384x384: Text encoder 0.05s + U-Net 2.36s/it + VAE Decoder 5.48s
    • 512x512: Text encoder 0.05s + U-Net 5.65s/it + VAE Decoder 11.13s
  • Memory usage:
    • 384x384: About 5.2GB
    • 512x512: About 5.6GB
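
As a rough worked example, the per-stage timings above can be combined into an end-to-end estimate. This sketch assumes one text encoder pass, one U-Net call per inference step, and one VAE decode, and it ignores model loading and scheduler overhead. The function name is illustrative and not part of this repository.

```python
# Per-stage timings in seconds, copied from the list above (RK3588, single NPU core).
TIMINGS = {
    "384x384": {"text_encoder": 0.05, "unet_per_step": 2.36, "vae_decoder": 5.48},
    "512x512": {"text_encoder": 0.05, "unet_per_step": 5.65, "vae_decoder": 11.13},
}


def estimate_seconds(resolution: str, steps: int) -> float:
    """Estimate generation time: text encoder + steps * U-Net + VAE decoder."""
    t = TIMINGS[resolution]
    return t["text_encoder"] + steps * t["unet_per_step"] + t["vae_decoder"]


if __name__ == "__main__":
    for res in TIMINGS:
        print(f"{res}: ~{estimate_seconds(res, 4):.2f}s for 4 steps")
```

With 4 steps this works out to roughly 14.97s at 384x384 and 33.78s at 512x512, so at this step count the VAE decoder accounts for about a third of the total.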

Usage

1. Clone or download this repository to your local machine.

2. Install dependencies

pip install diffusers pillow "numpy<2" rknn-toolkit-lite2

3. Run

python ./run_rknn-lcm.py -i ./model -o ./images --num-inference-steps 4 -s 512x512 --prompt "Majestic mountain landscape with snow-capped peaks, autumn foliage in vibrant reds and oranges, a turquoise river winding through a valley, crisp and serene atmosphere, ultra-realistic style."
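
To generate images for several prompts, the command above can be wrapped in a small driver. The sketch below only uses the flags shown in the command (`-i`, `-o`, `--num-inference-steps`, `-s`, `--prompt`). The helper names `build_command` and `run_all` are hypothetical, and how the script names its output files is not covered here.

```python
import shlex
import subprocess
import sys


def build_command(prompt, size="512x512", steps=4,
                  model_dir="./model", out_dir="./images"):
    """Build the argument list for one run_rknn-lcm.py invocation."""
    return [
        sys.executable, "./run_rknn-lcm.py",
        "-i", model_dir,
        "-o", out_dir,
        "--num-inference-steps", str(steps),
        "-s", size,
        "--prompt", prompt,
    ]


def run_all(prompts, dry_run=True, **kwargs):
    """Print each command; execute it only when dry_run is False."""
    for prompt in prompts:
        cmd = build_command(prompt, **kwargs)
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default, so this is safe to try on a machine without an NPU.
    run_all(["A lighthouse at dusk", "A red fox in fresh snow"], size="384x384")
```

Passing the prompt as a separate list element avoids shell quoting problems with prompts that contain quotes or commas.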


Model Conversion

Install dependencies

pip install diffusers pillow "numpy<2" rknn-toolkit2

1. Download the model

Download a Stable Diffusion 1.5 LCM model in ONNX format and place it in the ./model directory.

huggingface-cli download TheyCallMeHex/LCM-Dreamshaper-V7-ONNX
cp -r -L ~/.cache/huggingface/hub/models--TheyCallMeHex--LCM-Dreamshaper-V7-ONNX/snapshots/4029a217f9cdc0437f395738d3ab686bb910ceea ./model
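
The long path in the `cp` command follows the Hugging Face hub cache layout: `models--<owner>--<name>/snapshots/<revision>`. The sketch below rebuilds that path so it is easier to adapt to another model or revision. It assumes the default cache root `~/.cache/huggingface/hub` (environment variables such as `HF_HOME` can move it), and the helper name is hypothetical.

```python
from pathlib import Path
from typing import Optional


def hf_snapshot_dir(repo_id: str, revision: str,
                    cache_root: Optional[Path] = None) -> Path:
    """Return the cache directory holding one snapshot of a model repo."""
    if cache_root is None:
        # Assumed default location; not valid if the cache was relocated.
        cache_root = Path.home() / ".cache" / "huggingface" / "hub"
    folder = "models--" + repo_id.replace("/", "--")
    return cache_root / folder / "snapshots" / revision


if __name__ == "__main__":
    print(hf_snapshot_dir(
        "TheyCallMeHex/LCM-Dreamshaper-V7-ONNX",
        "4029a217f9cdc0437f395738d3ab686bb910ceea",
    ))
```

Depending on your `huggingface_hub` version, `huggingface-cli download <repo> --local-dir ./model` may let you skip the copy step entirely.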

In theory, you could also run LCM inference by merging the LCM LoRA into a regular Stable Diffusion 1.5 model and then converting the result to ONNX format. However, I don't know how to do this myself. If you do, feel free to submit a PR.

2. Convert the model

# Convert the model, 384x384 resolution
python ./convert-onnx-to-rknn.py -m ./model -r 384x384 

Note that the higher the resolution, the larger the model and the longer the conversion time. It's not recommended to use very high resolutions.
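
To see why resolution affects model size, it helps to look at the latent tensor the U-Net operates on. Stable Diffusion 1.5 uses a VAE with a downsampling factor of 8 and 4 latent channels, so the latent shape follows directly from the image size. The sketch below assumes the resolution string is written as WIDTHxHEIGHT, which the conversion script may or may not follow, and the helper names are hypothetical.

```python
from typing import Tuple

VAE_SCALE_FACTOR = 8   # Stable Diffusion 1.5 VAE downsampling factor
LATENT_CHANNELS = 4    # Stable Diffusion 1.5 latent channels


def parse_resolution(text: str) -> Tuple[int, int]:
    """Parse a string such as "384x384" into (width, height)."""
    width_str, height_str = text.lower().split("x")
    width, height = int(width_str), int(height_str)
    if width % VAE_SCALE_FACTOR or height % VAE_SCALE_FACTOR:
        raise ValueError(f"{text}: both sides must be multiples of {VAE_SCALE_FACTOR}")
    return width, height


def latent_shape(text: str, batch: int = 1) -> Tuple[int, int, int, int]:
    """Return the U-Net latent shape as (batch, channels, height/8, width/8)."""
    width, height = parse_resolution(text)
    return (batch, LATENT_CHANNELS,
            height // VAE_SCALE_FACTOR, width // VAE_SCALE_FACTOR)


if __name__ == "__main__":
    for res in ("384x384", "512x512"):
        print(res, "->", latent_shape(res))
```

Under these assumptions, 384x384 maps to a 48x48 latent and 512x512 to a 64x64 latent, which is about 1.8 times as many latent elements.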

Known Issues

  1. Models converted with rknn-toolkit2 2.2.0 suffer from severe precision loss, even with the fp16 data type. In the comparison image, the top is the result of inference with the ONNX model and the bottom is the result with the RKNN model, with all parameters identical. The higher the resolution, the more severe the precision loss. This was a bug in rknn-toolkit2 and is fixed in v2.3.0.

  2. The model conversion script accepts multiple resolutions (e.g., "384x384,256x256"), but doing so causes the conversion to fail. This is a bug in rknn-toolkit2.
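
For the precision loss described in issue 1, a numeric comparison can complement the visual one. The sketch below is not part of this repository: it compares two flattened output tensors, one from the ONNX model and one from the RKNN model for identical inputs, using cosine similarity and maximum absolute error. It uses plain Python lists; with NumPy arrays you could pass `array.ravel().tolist()`. The sample values are made up for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equally sized flattened tensors."""
    if len(a) != len(b):
        raise ValueError("tensors must have the same number of elements")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Two all-zero tensors are treated as identical.
        return 1.0 if norm_a == norm_b else 0.0
    return dot / (norm_a * norm_b)


def max_abs_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference."""
    if len(a) != len(b):
        raise ValueError("tensors must have the same number of elements")
    return max(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    onnx_out = [0.10, -0.52, 0.33, 0.90]   # illustrative values only
    rknn_out = [0.11, -0.50, 0.30, 0.88]   # illustrative values only
    print("cosine similarity:", cosine_similarity(onnx_out, rknn_out))
    print("max abs error:", max_abs_error(onnx_out, rknn_out))
```

Running the same check at several resolutions would show whether the error grows with resolution, as reported above.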

References
