This is a set of VAEEncoder.mlmodelc bundles that enable the image2image feature in MOCHI DIFFUSION 3.2, 4.0, and later, when using incompatible "older" CoreML converted models.
They are provided in a single zip file containing five VAEEncoder.mlmodelc files. Each file name indicates which kind of model it is for:
- for split_einsum 512x512 models
- for original 512x512 models
- for original 512x768 models
- for original 768x512 models
- for original 768x768 models
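As a rough illustration of how the five files map to model types, here is a minimal Python sketch. It assumes every file follows the naming pattern of the two names quoted in the installation examples (NEW-ORIG-512x768-VAEEncoder.mlmodelc and NEW-SPLIT-512x512-VAEEncoder.mlmodelc); the other three names are inferred from that pattern, so check them against the actual zip contents. The function name `bundle_name` is hypothetical.

```python
# Supported (attention type, size) combinations, taken from the list above.
SUPPORTED = {
    ("split_einsum", "512x512"),
    ("original", "512x512"),
    ("original", "512x768"),
    ("original", "768x512"),
    ("original", "768x768"),
}

# Prefixes as they appear in the two file names quoted in the instructions.
PREFIX = {"original": "NEW-ORIG", "split_einsum": "NEW-SPLIT"}


def bundle_name(attention: str, width: int, height: int) -> str:
    """Return the assumed file name for a model type and WIDTH x HEIGHT size."""
    size = f"{width}x{height}"
    if (attention, size) not in SUPPORTED:
        raise ValueError(f"no replacement VAEEncoder for {attention} {size}")
    return f"{PREFIX[attention]}-{size}-VAEEncoder.mlmodelc"


if __name__ == "__main__":
    print(bundle_name("original", 512, 768))
    print(bundle_name("split_einsum", 512, 512))
```

A combination that is not in the zip, such as split_einsum 768x768, raises a `ValueError` instead of guessing a file name.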
They should enable image2image for ANY model trained or merged from the Stable Diffusion v1.5 base model. They will also work with models derived from Stable Diffusion v2.0 or v2.1 base models, but Mochi Diffusion has limited support for SD-2.x models.
These VAEEncoders were built from vae-ft-mse-840000-ema-pruned.ckpt. That is the VAE that was distributed with the original Stable Diffusion 1.5 model. It is the VAE used in the vast majority of trained and merged 1.5-type models. There is an alternate VAE, kl-f8-anime.ckpt, that is sometimes used instead in anime-focused models. I believe that the differences in that VAE are only relevant to the VAEDecoder. These are replacement VAEEncoders, not VAEDecoders. If your model has the kl-f8-anime VAE baked in, it will still do its job through the VAEDecoder.
- If the model folder that you are upgrading already has a VAEEncoder.mlmodelc file inside, rename that file to VAEEncoder.mlmodelc.bak first, to keep it in case you want to return to it later. It is fine to leave it in the folder. It only needs to be renamed.
- Copy the appropriate new VAEEncoder.mlmodelc to the model folder of the model you are upgrading. The new VAEEncoder.mlmodelc file needs to match the original model in size and compute unit type.
- Examples: If the model is an original 512x768 model, copy the file named NEW-ORIG-512x768-VAEEncoder.mlmodelc. If the model is a split_einsum 512x512 model, copy the file named NEW-SPLIT-512x512-VAEEncoder.mlmodelc.
- After copying the appropriate VAE file, you must rename it to VAEEncoder.mlmodelc. That means removing the NEW-ORIG-512x768-, NEW-SPLIT-512x512-, etc., from the copied file's name.
- The upgraded model should now work with both image2image and text2image.
- IMPORTANT: Remember that in image2image, your STARTING IMAGE must always be the same size as your model. A 512x512 model will only work with a 512x512 starting image, a 768x512 model will only work with a 768x512 starting image, etc.
- Both the model and the starting image sizes are listed as WIDTH x HEIGHT. 512x768 is a "portrait" orientation. 768x512 is a "landscape" orientation.
- Please report any issues that you encounter in the Community Discussion area here so that I can try to resolve them.
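The install steps and the size rule above can be sketched in Python as follows. This is only an illustration, not a tool shipped with the zip: the function names `install_vae_encoder` and `start_image_fits` are hypothetical, and the paths in the usage note are placeholders. Compiled Core ML models are normally folders on disk, even though they are called files here, so the sketch handles either case.

```python
import shutil
from pathlib import Path

TARGET = "VAEEncoder.mlmodelc"


def install_vae_encoder(new_bundle, model_dir) -> Path:
    """Back up any existing VAEEncoder.mlmodelc, then copy the new one in place."""
    new_bundle = Path(new_bundle)
    model_dir = Path(model_dir)
    target = model_dir / TARGET
    backup = model_dir / (TARGET + ".bak")

    # Step 1: rename an existing encoder to VAEEncoder.mlmodelc.bak.
    if target.exists():
        if backup.exists():
            raise FileExistsError(f"{backup} already exists; not overwriting it")
        target.rename(backup)

    # Steps 2 and 3: copy the new bundle and give it the plain name,
    # dropping the NEW-ORIG-512x768- style prefix.
    if new_bundle.is_dir():
        shutil.copytree(new_bundle, target)
    else:
        shutil.copy2(new_bundle, target)
    return target


def start_image_fits(model_size: str, image_width: int, image_height: int) -> bool:
    """Check a starting image against a model size written as WIDTHxHEIGHT."""
    width, height = (int(part) for part in model_size.lower().split("x"))
    return (image_width, image_height) == (width, height)
```

For example, `install_vae_encoder("NEW-ORIG-512x768-VAEEncoder.mlmodelc", "path/to/model-folder")` would leave the old encoder as VAEEncoder.mlmodelc.bak next to the new one, and `start_image_fits("768x512", 512, 768)` returns `False` because the starting image is portrait while the model is landscape.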