[Converter Help] How does the new converter work?
Hi,
Thank you, the new update looks fire! :D
It's very early, but I've been refreshing this page for days xD. Can you give a little more guidance on how to use the new converter?
- I tried textual inversion (.pt) but it didn't work, I guess there is a conversion step before that? (I was lazy, I just converted the "text-encoder" and replaced those files.)
- It looks like ControlNet uses custom resolutions; can you help, or share a list of ControlNet models? I didn't try the lllyasviel repo because it felt wrong (5 GB).
Congratulations, amazing job
Hello again @ZProphete! Thanks for the comment :)
- On textual inversion, you should select a folder with the embedding(s) inside. After that, make sure you are converting the TextEncoder (it's the only module affected) and that “Load embeddings” is enabled. Once converted, you should be able to right click the TextEncoder.mlmodelc, choose “Show Package Contents”, and find a new added_vocab.json with the new tokens; those are the tokens you should use (see the sketch after this list). If that worked, you can replace the TextEncoder in the model's folder, load it in Guernika, and it should work. There is a new version, Guernika 3.0, waiting for review by Apple that makes managing and changing the TextEncoder easier, but on the latest one it should already work by changing it manually.
- For ControlNet you can use the ones listed here for 1.x or here for 2.x, for example this one or this one. I will upload the 2.1 ones to the main repo once I convert them all; I'm sure there are others already, but I haven't found them.
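Following up on the textual inversion point above: a quick way to double-check the added tokens is to read that JSON directly. A minimal sketch, assuming the .mlmodelc package layout described above and that added_vocab.json is a flat token-to-id mapping (the path here is a hypothetical example):

```python
import json
from pathlib import Path

# Hypothetical path to a converted text encoder package
encoder = Path("MyModel/TextEncoder.mlmodelc")

# added_vocab.json sits inside the .mlmodelc package after converting
# with "Load embeddings" enabled; assuming it maps token -> id
with open(encoder / "added_vocab.json") as f:
    added_vocab = json.load(f)

for token, token_id in added_vocab.items():
    print(f"{token} -> {token_id}")  # these are the tokens to use in prompts
```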
Please let me know if you have any more success with those tips, and thank you for testing literally everything.
You've got a little typo.
At some point I successfully got a textual inversion converted, but I wasn't sure if it actually worked because I didn't set a comparison point, and Obama wasn't a good TI to try out...
but now that I'm trying again I'm getting this module error? :)
Sadly, ControlNet conversion crashes, even on default settings, using a URL or a local folder.
Edit: How do you clear the paths in the model origin tabs? :)
My pleasure @GuiyeC, I wish you great success.
@ZProphete
I will check that out.
Why do you want to clear the paths? You only convert the selected origin.
@ZProphete
I uploaded 3.0.1, which fixes the typo on ALL and ControlNet conversion. I'm not sure about the pytorch_lightning error; are you getting it consistently with one model? If so, could you share which one it is?
Also, Guernika 3.0 should already be ready in the App Store :)
@GuiyeC Everything is now fine with 3.0.1 and Guernika 3.0!
The "custom token" interface is clever,
for some reason, the trigger word provided (from Civitai), is different than the one found in "custom token" interface ?I'm experiencing little bug while swaping text-encoder, the first one loaded stays in memory,
I had to quit and change text-encoder at app start.Actually I'm getting the pytorch_lightning error on "v1-5-pruned-emaonly.ckpt", I redownloaded but same error, tried a few it others it was fine.
It's definitely working! Nice!
The converter feels faster too (y)
What are your thoughts on LoRA embeddings (not merged)? I know the current Core ML implementation is pretty rigid, hence all the conversions, but you are certainly a smart guy.
No pressure, you are probably working on img2img masks, right? ;D
- I saw that too. I'm loading the token stored in the embedding file; I believe some implementations allow you to pick which token to use, and that's what they mean on Civitai. I could do that too, but the process might get overcomplicated. (See the sketch after this list for where that token comes from.)
- What do you mean it stays in memory? What are you seeing? Can you give me steps to reproduce?
- Is this one it? I will give it a try.
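On the first point, for reference: a minimal sketch of where that token comes from, assuming the common Automatic1111-style .pt layout with a name key and a string_to_param mapping (other trainers lay the file out differently, and the file name here is hypothetical):

```python
import torch

# Hypothetical embedding file; the layout varies between trainers
data = torch.load("my_embedding.pt", map_location="cpu")

# Automatic1111-style embeddings store a trained name plus a
# placeholder-token -> tensor mapping
name = data.get("name")                   # e.g. "my-style"
params = data.get("string_to_param", {})  # e.g. {"*": tensor of shape (n, 768)}

for token, tensor in params.items():
    print(f"token {token!r} (named {name!r}): shape {tuple(tensor.shape)}")
```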
I gave LoRA another look, but I don't think that's going to be possible any time soon. I thought about doing something like with the TextEncoder, baking them permanently into a converted Unet, but LoRA changes the Unet quite a lot, so that would require many more changes.
What do you mean by img2img masks? XD I thought about maybe improving the UI for the next version; I'm not sure how yet, but some of you have already given me quite a lot of ideas.
Could you also indicate which compute units and modules you were trying to convert when you got the pytorch_lightning error?
EDIT: Never mind, I just saw it with just the TextEncoder 👌
@ZProphete Updated Guernika Model Converter to 3.0.2; that should fix the problem with the pytorch_lightning module.
Sure, here are the steps to reproduce:
- Generate one image with the newly added TI token, but on the default text encoder.
- Switch to the "new text encoder" and generate the same prompt with the same seed: image (1) and image (2) come out identical.
- Restart Guernika, select the "new text encoder", generate the same prompt with the same seed: the image is now different (TI working), and vice versa. I also noticed a few crashes when playing a lot with the select menu, but maybe that's just my computer.
- Affirmative, this one!
- An important feature to me is an img2img mask, similar to an inpainting brush, to add details here and there while protecting the rest of the image, especially the "whole-image" vs "only-masked" option like the popular webui has. With "only-masked" activated it crops an area around the painted region and treats it as a full-resolution img2img input, and there is a blend/padding slider to control how it blends around the painted brush, or something like that (see the sketch below).
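My understanding of how the webui's "only-masked" mode works, as a minimal sketch (this is not Guernika's code; run_img2img is a stand-in for the actual diffusion call, and 512x512 is an assumed model resolution):

```python
from typing import Callable
from PIL import Image, ImageFilter

def only_masked_img2img(image: Image.Image, mask: Image.Image,
                        run_img2img: Callable[[Image.Image], Image.Image],
                        padding: int = 32, blur: int = 8) -> Image.Image:
    """Sketch of a webui-style 'only-masked' inpainting pass."""
    # Bounding box of the painted mask region, grown by the padding
    # slider so the model sees some surrounding context
    left, top, right, bottom = mask.getbbox()
    box = (max(left - padding, 0), max(top - padding, 0),
           min(right + padding, image.width), min(bottom + padding, image.height))

    # Diffuse only the crop, upscaled to model resolution, then scale back
    crop = image.crop(box)
    result = run_img2img(crop.resize((512, 512))).resize(crop.size)

    # Feather the mask so the patch blends into its surroundings
    feathered = mask.crop(box).filter(ImageFilter.GaussianBlur(blur))
    out = image.copy()
    out.paste(result, box, feathered)
    return out
```

Because only the cropped region goes through the model, the masked area effectively gets the full model resolution, which is presumably what makes it so fast at adding detail.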
You once proposed a draggable square zone à la DALL-E, pretty challenging; I suppose it's the same spirit (I'm all for it!)
It could be some unified tab for inpainting/outpainting/adding details: you could delete/inpaint, or mask + img2img on the canvas. (Edit:) Drag and pan/zoom the canvas around while the square selection stays centered; everything under the square is treated as a 512x512 image input. I don't know :)
I also noticed that when you drop a 512x768 image into, let's say, a 640x960 img2img input, W and H are reversed (I saw that on Apple's repo too), causing missing space in the generation, also visible in the img2img preview. But if you resize it manually to 640x960 with the Preview app and use that as input, it works as is... so maybe resizing it inside Guernika is where the quick fix lives.
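For what it's worth, a minimal sketch of the kind of in-app pre-resize that would sidestep the swap, assuming PIL-style (width, height) ordering (nothing here reflects Guernika's actual code; file names are hypothetical):

```python
from PIL import Image

img = Image.open("input.png")   # e.g. 512x768 (width x height)
target_w, target_h = 640, 960   # the img2img canvas size

# PIL's resize takes (width, height); swapping the two values produces
# exactly the kind of W/H reversal described above
img.resize((target_w, target_h), Image.LANCZOS).save("resized.png")
```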
I'm very pleased with how it adds details so much quicker than the popular webui (the timer count is off while using this upscaling technique, but it's what, 30-40 seconds total, nothing like 4 minutes on the webui). That's good!
If you are looking for new challenges, I also like:
- https://github.com/opparco/stable-diffusion-webui-two-shot (I loved these examples: https://youtu.be/uR89wZMXiJ8?t=697)
Well done!!
Quick questions, just out of curiosity:
If custom resolutions are not possible via Neural Engine or All compute units, why still allow models, and especially ControlNet, to be converted for them? Can we expect Apple to fix it, or do you have something up your sleeve? I wish at least All compute units worked, even a little slower, as it would free some resources.
Also, the lower the conditioning weight in ControlNet, the slower it gets: CPU usage increases and GPU usage decreases. It doesn't bother me, as for my usage of Human Pose it's always set at 100%, but I was wondering why.
Thanks for your time :)
@ZProphete let's go point by point:
- I fixed the TextEncoder bug. The problem was that the tokens were not being recalculated after changing the encoder; changing the prompt would force a recalculation, but in the next version this will be fixed.
- I will look into the img2img mask and the DALL-E canvas, but I can't promise anything 😅
- I will also look into the resizing problem; it's probably a quick fix, thanks for the report 🙏
- Custom res not being possible via NE or All: that is true, and I should probably disallow using anything other than GPU. The only problem is that I don't really know what the original resolution of the model would be, and I'm not 100% sure that anything other than 512x512 doesn't work on the NE; maybe SD 2.1 does, I have to check.
- The thing here is that when applying a conditioning weight, every value generated by the ControlNet is multiplied by it; this step is skipped when conditioning is 100% (see the sketch below). I tried improving this a bit in the latest version, but it is still noticeably slower.
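To illustrate that last point, a minimal sketch of the general technique (not Guernika's actual code; apply_controlnet_weight and residuals are hypothetical names):

```python
import torch

def apply_controlnet_weight(residuals: list[torch.Tensor],
                            weight: float) -> list[torch.Tensor]:
    """Scale ControlNet residuals before they are added to the Unet."""
    if weight == 1.0:
        # 100% conditioning: skip the multiplication entirely,
        # which is why full weight runs noticeably faster
        return residuals
    # Any other weight multiplies every residual element-wise,
    # an extra pass over all of the ControlNet's outputs
    return [r * weight for r in residuals]
```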
@ZProphete
I just tested it and I got the same error. I don't think there is much I can do there: it tries to load the CKPT, and without the configuration file (.yaml) that tells it how many in_channels this model has (in this case 9), it defaults to the configuration of a regular model, which has 4 in_channels.
You can use the HuggingFace identifier or maybe try your luck with another config file; this one seemed to work, but I would just use the HuggingFace identifier instead. If you use the config file, remember to rename it so it has the same name as the checkpoint but with the .yaml extension, in this case sd-v1-5-inpainting.yaml.
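For anyone hitting the same wall, this is roughly what that config file controls; a minimal sketch using diffusers' checkpoint converter (not necessarily what the Guernika converter calls internally, and the exact import path and keyword names vary across diffusers versions):

```python
from diffusers.pipelines.stable_diffusion.convert_from_ckpt import (
    download_from_original_stable_diffusion_ckpt,
)

# Without original_config_file the loader assumes a regular 4-channel
# Unet; the inpainting checkpoint needs the 9-channel config instead
pipe = download_from_original_stable_diffusion_ckpt(
    checkpoint_path="sd-v1-5-inpainting.ckpt",
    original_config_file="sd-v1-5-inpainting.yaml",
)
```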
@GuiyeC
It's converting! I wasn't aware of the config file :)
Thank you! 🙏
And congrats: new App Store ranking!