metadata
license: creativeml-openrail-m
tags:
  - coreml
  - stable-diffusion
  - text-to-image

These are Stable Diffusion v1.5 type models and compatible ControlNet v1.1 models that have been converted to Apple's CoreML format.

They are for use with a Swift app or the SwiftCLI.

The SD models are all "original" (not split-einsum) and built for CPU and GPU. They are each for the output size noted. They are fp16, with the standard SD-1.5 VAE embedded.

The Stable Diffusion v1.5 model and the other SD 1.5 type models contain both the standard Unet and the ControlledUnet used for a ControlNet pipeline. The correct one will be used automatically based on whether a ControlNet is enabled or not.

They have VAEEncoder.mlmodelc bundles that allow Image2Image to operate correctly at the noted resolutions, when used with a current Swift CLI pipeline or a current GUI built with ml-stable-diffusion 0.4.0, such as Mochi Diffusion 3.2 or later.
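
To check what an unzipped model folder appears to support before loading it, something like the following Python sketch may help. Only `VAEEncoder.mlmodelc` is named in this README; `Unet.mlmodelc` and `ControlledUnet.mlmodelc` are assumed bundle names, and `describe_model_folder` is a hypothetical helper, so compare the names against your own unzipped folder.

```python
from pathlib import Path

# Bundle names: VAEEncoder.mlmodelc is mentioned in this README. The two Unet
# names are ASSUMPTIONS; check them against the contents of your folder.
STANDARD_UNET = "Unet.mlmodelc"
CONTROLLED_UNET = "ControlledUnet.mlmodelc"
VAE_ENCODER = "VAEEncoder.mlmodelc"


def describe_model_folder(model_dir: Path) -> dict[str, bool]:
    """Report which features an unzipped model folder appears to support."""

    def has(name: str) -> bool:
        return (model_dir / name).exists()

    return {
        "standard": has(STANDARD_UNET),       # plain text-to-image
        "controlnet": has(CONTROLLED_UNET),   # ControlNet pipeline
        "image2image": has(VAE_ENCODER),      # Image2Image needs the encoder
    }
```

This only looks at file names. It does not confirm that a bundle was built for a given resolution or that it loads correctly.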

All of the ControlNet models are also "original" ones, built for CPU and GPU compute units (cpuAndGPU) and for SD-1.5 type models. Each zip file contains a set of models at 4 resolutions. They may also work with split-einsum models, using CPU and GPU (not CPU and NE), but they will not work with SD-2.1 type models at all.

All of the models in this repo work only with Swift and the current ml-stable-diffusion pipeline (0.4.0). They were not built for a python diffusers pipeline. They need apple/ml-stable-diffusion (from GitHub) for command line use, or a Swift app that supports ControlNet, such as the Mochi Diffusion test version currently in a closed beta test at https://github.com/godly-devotion/MochiDiffusion. Join the Mochi Diffusion Discord server at https://discord.gg/x2kartzxGv to request access to the beta test version.
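
For command line use, the sketch below assembles a `swift run StableDiffusionSample` command as an argument list, which you could run from a checkout of apple/ml-stable-diffusion. It is a rough sketch, not the author's workflow. The option names reflect my understanding of the 0.4.0 Swift CLI, so treat them as assumptions and confirm them with `swift run StableDiffusionSample --help`. All paths and the ControlNet name `Canny` are hypothetical placeholders.

```python
import shlex


def build_command(prompt, resource_path, output_path,
                  controlnet=None, controlnet_input=None, seed=None):
    """Build an argv list for the Swift CLI (option names are assumptions)."""
    if bool(controlnet) != bool(controlnet_input):
        raise ValueError("give both a ControlNet name and its input image, or neither")
    args = [
        "swift", "run", "StableDiffusionSample", prompt,
        "--resource-path", resource_path,   # unzipped model folder
        "--output-path", output_path,
        "--compute-units", "cpuAndGPU",     # these models are built for CPU and GPU
    ]
    if seed is not None:
        args += ["--seed", str(seed)]
    if controlnet:
        args += ["--controlnet", controlnet,
                 "--controlnet-inputs", controlnet_input]
    return args


# Hypothetical paths and names, for illustration only.
command = build_command(
    "a lighthouse at dusk",
    "models/SD15_512x512",
    "output",
    controlnet="Canny",
    controlnet_input="inputs/edges.png",
    seed=93,
)
print(shlex.join(command))
```

Building a list rather than a single string keeps prompts with spaces intact if the list is later passed to `subprocess.run`.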

The full SD models are in the "SD" folder here. They are in subfolders by model name and individually zipped for a particular resolution. They need to be unzipped for use after downloading.

The ControlNet model files are in the "CN" folder here. They are also zipped and need to be unzipped after downloading. Note that they are zipped into sets of 4: 512x512, 512x768, 768x512, 768x768 for each ControlNet type.
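
If you download several zips at once, a short Python sketch like this one can unzip them in a single pass. `unzip_all` is a hypothetical helper and the folder names in the usage comment are placeholders. The layout inside each zip is not described here, so check the result for an extra nested folder.

```python
import zipfile
from pathlib import Path


def unzip_all(download_dir: Path, dest_dir: Path) -> list[Path]:
    """Extract every .zip in download_dir into dest_dir/<zip name without .zip>."""
    extracted = []
    for archive in sorted(download_dir.glob("*.zip")):
        target = dest_dir / archive.stem
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(target)
    return extracted


# Example (placeholder paths):
# unzip_all(Path.home() / "Downloads", Path.home() / "coreml-models")
```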

There is also a MISC folder that has text files with my notes and a screencap of my directory structure. These are provided for folks who want to try converting models themselves and/or running the models with a SwiftCLI. The notes are not perfect, and may be out of date as the various python and CoreML packages are updated.

For command line use, it all runs in a miniconda3 environment, covered in one of the notes. If you are using the command line, please read the notes concerning naming and placement of your ControlNet model folder. If you are using a GUI, it will most likely guide you to the correct location/arrangement.

The sizes are always meant to be WIDTH x HEIGHT. A 512x768 is "portrait" orientation and a 768x512 is "landscape" orientation.
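
As a small illustration of the WIDTH x HEIGHT convention, this Python sketch splits a size label and names its orientation. The function names are hypothetical and used only for this example.

```python
def parse_size(label: str) -> tuple[int, int]:
    """Split a 'WIDTHxHEIGHT' label such as '512x768' into (width, height)."""
    width_text, height_text = label.lower().split("x")
    return int(width_text), int(height_text)


def orientation(label: str) -> str:
    """Classify a size label as 'portrait', 'landscape', or 'square'."""
    width, height = parse_size(label)
    if width == height:
        return "square"
    return "portrait" if height > width else "landscape"


for size in ("512x512", "512x768", "768x512", "768x768"):
    print(size, orientation(size))
```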

If you encounter any models that do not work fully with image2image and ControlNet, using the current apple/ml-stable-diffusion SwiftCLI pipeline, Mochi Diffusion 3.2, or the Mochi Diffusion CN test build, please leave a report in the Community area here. If you would like to add models that you have converted, leave a message as well, and I'll try to figure out how to grant you access to this repo.

Model List

Models are organized into folders by model name. Each folder contains 4 zip files of single models for the output size indicated: 512x512, 512x768, 768x512 or 768x768.

  • DreamShaper v5.0, 1.5-type model, original, for ControlNet & Standard <<<=== NEW
  • GhostMix v1.1, 1.5-type anime model, original, for ControlNet & Standard
  • MeinaMix v9.0, 1.5-type anime model, original, for ControlNet & Standard
  • MyMerge v1.0, 1.5-type NSFW model, original, for ControlNet & Standard
  • Realistic Vision v2.0, 1.5-type model, original, for ControlNet & Standard
  • Stable Diffusion v1.5, original, for ControlNet & Standard

ControlNet List

Each file is a set of 4 resolutions zipped together: 512x512, 512x768, 768x512, 768x768

  • Canny -- Edge Detection, Outlines As Input
  • Depth -- Reproduces Depth Relationships From An Image
  • InPaint -- Modify An Indicated Area Of An Image (not sure how this works)
  • InstrP2P -- Instruct Picture2Picture, Modified By Text ("change dog to cat")
  • LineArt -- Find And Reuse Small Outlines
  • MLSD -- Find And Reuse Straight Lines And Edges
  • OpenPose -- Copy Body Poses
  • Scribble -- Freehand Sketch As Input
  • SoftEdge -- Find And Reuse Soft Edges
  • Tile -- Subtle Variations In Batch Runs