How to Use Supermerger

#66
by Veil - opened

The Supermerger add-on for the Automatic1111 Web UI looks very interesting, but I can't understand how to make use of it from the usage instructions. Is there anywhere that explains how to properly use this tool?

You can use this instead: https://github.com/bbc-mc/sdweb-merge-block-weighted-gui. It does the same thing and has a better UI.

The usage instructions for that tool aren't really any better. How is one to determine what values to use in order to achieve a good or desired result?

There is no universal guide for it; you need to do your own testing. The first step is understanding how the UNet works. One thing that should generally hold for any model is that the first and last layers have a large impact on style, while the middle layers have an impact on composition.

Not sure why I wouldn't just use the standard model merger in that case. What's the point of entering a bunch of random numbers into this more complicated system and hoping for some miracle to occur?

It can help you if you are expecting something specific to happen. Examples: keeping the style of one model, mixing SFW and NSFW models, fixing color/VAE issues.

I'll give this one last try. Let's say that I want to use this tool to keep the style of a model while changing some other aspect of it. How would I go about that, i.e., how would I determine what values to use to accomplish that goal?

For example, this will pretty much keep the style of model A, while model B will have roughly 40% influence on the generated content:
[Screenshot of the block weight settings: Snímek obrazovky 2023-03-07 221735.png]

Basically, IN00-IN04 and OUT08-OUT11 mostly affect style, and the further down the U you go, the more the layers affect composition. That's because the upper layers of the U work at higher resolutions, so they can't change composition but can change style, while the bottom layers of the U work at low resolutions, so they can change composition but can't change style.

Here is the visualization: [U-Net architecture diagram, remotesensing-12-02001-g003.png]
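
To keep that picture around in text form, here is a small Python sketch of the 26 MBW slots for an SD 1.x model (BASE, IN00-IN11, M00, OUT00-OUT11) annotated with the rough style/composition roles described above. The role labels are just this thread's heuristic, not anything guaranteed by the architecture.

```python
# Rough sketch only: the 26 MBW slots for SD 1.x in Supermerger
# (BASE, IN00-IN11, M00, OUT00-OUT11), annotated with the thread's heuristic.
BLOCKS = ["BASE"] + [f"IN{i:02d}" for i in range(12)] + ["M00"] + [f"OUT{i:02d}" for i in range(12)]

def rough_role(name: str) -> str:
    # Heuristic from this thread, not a hard rule: blocks near the top of the
    # "U" (IN00-IN04, OUT08-OUT11) lean toward style; blocks near M00 work at
    # low resolution and lean toward composition.
    if name == "BASE":
        return "base alpha (see the later discussion about what BASE applies to)"
    if name in {f"IN{i:02d}" for i in range(5)} | {f"OUT{i:02d}" for i in range(8, 12)}:
        return "mostly style"
    return "more composition, the closer to M00"

for block in BLOCKS:
    print(f"{block}: {rough_role(block)}")
```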

Does that mean the IN and OUT values are good for playing around with the style and/or the composition that the primary model contributes to the output?

According to the visualization you gave, IN00-IN04 is the best range for playing around with the style of the primary model and how much it contributes, together with the secondary model, to the output. Am I correct?

I simply think of it as an advanced version of the regular checkpoint merger.

Yeah, you are pretty much right.

I'm sorry, but how can we figure out what value to set for each layer? 🤔

Well, aside from what I already said, there are no rules. You can only experiment, create XYZ plots, and compare them.

Old discussion, and I don't know what the etiquette about bumping is, but this was linked from Civitai. As someone who has been doing stuff like this for a while:

https://huggingface.co/Yntec/lamettaRemix
(Grabbing two models I like and using merge block weights until it produces my favorite things from both; in this case, the version of lametta with the best compositions merged with the one with the best eyes.)

This is my advice:

1.- Don't bother with intermediate values; just try 0s and 1s. What you have in mind is in there, and intermediate values chicken out and don't give you the full weights.

2.- Use Weight Sum Train Difference. It will automatically subtract Model A from the merge and train the difference between the models as if the missing training data had been added. This produces the best results, and it's even an improvement over LoRA merging (you bake a LoRA into a model, then use Weight Sum Train Difference to train the difference of the LoRA into the original model). Sometimes you get really close to what you wanted but the eyes are messed up; Weight Sum Train Difference at alpha 1 doesn't have this problem.
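
Since the mode names come up again later in this thread, here is a minimal per-tensor sketch of the two baseline merge formulas (plain weight sum and plain add difference). This is not the trainDifference calcmode itself, which does more than simple subtraction; it only shows the basic arithmetic the mode names refer to, applied to toy tensors.

```python
import torch

def weight_sum(a: torch.Tensor, b: torch.Tensor, alpha: float) -> torch.Tensor:
    # "Weight sum": (1 - alpha) * A + alpha * B
    return (1.0 - alpha) * a + alpha * b

def add_difference(a: torch.Tensor, b: torch.Tensor, c: torch.Tensor, alpha: float) -> torch.Tensor:
    # "Add difference": A + alpha * (B - C)
    return a + alpha * (b - c)

# Toy tensors standing in for one weight matrix from each checkpoint.
a, b, c = torch.randn(4, 4), torch.randn(4, 4), torch.randn(4, 4)
merged = add_difference(a, b, c, alpha=1.0)
```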

3.- Think of the most important parts of the UNET as A, B, C, D. The very first thing I do is a merge of the models with the values 0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1

Think of it as follows:

A=0 (base)
B=0,0,0,0,0,0,0,0,0,0,0,0 (12 IN blocks)
C=1 (M00 - the middle block)
D=1,1,1,1,1,1,1,1,1,1,1,1 (12 OUT blocks)

What you have in mind has a composition or a style; does this merge have it? If so, you're going to save half of the tests you were going to make, because you already got it. Otherwise, try this same thing again, but with Model A and Model B swapped.
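
As a sanity check, here is a tiny sketch (assuming the SD 1.x layout of 26 comma-separated values: BASE, twelve IN blocks, M00, twelve OUT blocks) that expands the A/B/C/D shorthand into the full weight string:

```python
def build_weights(a, b, c, d):
    """a: BASE, b: list of 12 IN weights, c: M00, d: list of 12 OUT weights."""
    assert len(b) == 12 and len(d) == 12
    return ",".join(str(v) for v in [a, *b, c, *d])

# Step 3's starting point: A=0, B=all 0s, C=1, D=all 1s.
print(build_weights(0, [0] * 12, 1, [1] * 12))
# -> 0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1
```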

4.- The base should always be the opposite of the M00 block, so, if the model doesn't already look the way you imagined, you need to try these values:

0,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1
1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0

And now try them with the models swapped. By now you have 8 merges, which should be enough to know which one is closest to your wanted output. You can now leave A and C the way they are, and Model A and Model B in the order they are, and continue from there.
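
If it helps to see them all in one place, here is a short sketch that spells out the 8 starting merges: the four BASE/IN/M00/OUT patterns from steps 3 and 4, each tried with the two models in both orders. The block layout assumed is the same 26-slot SD 1.x string as above.

```python
def pattern(a, b, c, d):
    # a: BASE, b: value repeated for all 12 IN blocks, c: M00, d: all 12 OUT blocks.
    return ",".join(str(v) for v in [a] + [b] * 12 + [c] + [d] * 12)

patterns = [
    pattern(0, 0, 1, 1),  # step 3
    pattern(0, 1, 1, 0),  # step 4, first string
    pattern(1, 0, 0, 1),  # step 4, second string
    pattern(1, 1, 0, 0),  # step 4, third string
]

for order in ("A/B", "B/A"):   # each pattern is tried with both model orders
    for p in patterns:
        print(order, p)
```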

5.- I don't know which of the 8 you'll be working with, so from here I'll substitute A and C with X, meaning whatever values you're using. These are the magic numbers to try:

X,1,1,1,1,1,1,0,0,0,0,0,0,X,0,0,0,0,0,0,1,1,1,1,1,1
X,0,0,0,0,0,0,1,1,1,1,1,1,X,1,1,1,1,1,1,0,0,0,0,0,0

If you wanted one of the models' compositions with the other's style, you should already have it. Now that you have it, try swapping the order of the models and inverting all the 0s to 1s and all the 1s to 0s; it may give something even better.
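
The "invert all the 0s and 1s" part is mechanical, so here is a throwaway helper (it assumes the X slots have already been filled in with real 0/1 values):

```python
def invert(weights: str) -> str:
    # Only meaningful on plain 0/1 strings; flips every slot, BASE and M00 included.
    return ",".join("1" if v.strip() == "0" else "0" for v in weights.split(","))

print(invert("1,1,1,1,1,1,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,1,1,1,1,1"))
# -> 0,0,0,0,0,0,0,1,1,1,1,1,1,0,1,1,1,1,1,1,0,0,0,0,0,0
```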

6.- If it still doesn't do what you want, try these ones:

X,1,1,1,1,1,1,0,0,0,0,0,0,X,1,1,1,1,1,1,0,0,0,0,0,0
X,0,0,0,0,0,0,1,1,1,1,1,1,X,0,0,0,0,0,0,1,1,1,1,1,1

If at any point you're getting outputs with the wrong model's composition, swap the order of the models or swap the A and C blocks. Otherwise, you're still saving time because you don't need to do the versions with the models swapped, or with the A and C blocks swapped, or both, so you only need to test 1/4 of the possibilities!
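
"Swap the A and C blocks" is also mechanical: exchange the BASE slot with the M00 slot and leave everything else untouched. A quick sketch, again assuming the 26-slot SD 1.x string:

```python
def swap_base_and_mid(weights: str) -> str:
    vals = [v.strip() for v in weights.split(",")]
    assert len(vals) == 26, "expects the 26-value SD 1.x MBW string"
    vals[0], vals[13] = vals[13], vals[0]   # BASE is slot 0, M00 is slot 13
    return ",".join(vals)

print(swap_base_and_mid("0,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0"))
# -> 1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,0
```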

7.- If you're trying to bring out something specific that you want, by now you should have about 16 merges, and one of them should have it, just with the wrong composition or another problem. So now you start tweaking the numbers of the one that was the closest; again, the order of the models and the base and M00 should be set in stone by now. Suppose this one was the closest:

1,0,0,0,0,0,0,1,1,1,1,1,1,0,1,1,1,1,1,1,0,0,0,0,0,0 (Model A as Model 2 and Model B as Model 1)

You can try adding a 1 to B and adding a 0 to C, like this:

1,0,0,0,0,0,1,1,1,1,1,1,1,0,1,1,1,1,1,0,0,0,0,0,0,0

Or the opposite:

1,0,0,0,0,0,0,0,1,1,1,1,1,0,1,1,1,1,1,1,1,0,0,0,0,0

See what they do and what gets you closer; at some point it may become obvious which numbers have to remain as they are, or else you get worse results.

Suppose this one was the closest one:

0,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0 (Model A as Model 1 and Model B as Model 2)

Now, just in B, swap a 0 and a 1, and make D copy what you changed, like this:

0,1,1,0,1,1,1,0,0,1,0,0,0,1,1,1,0,1,1,1,0,0,1,0,0,0

And then you'd just keep trying different strings like these until you hit the jackpot. By reading this text you may not know what valid strings to try are, but after actually doing the merges you develop intuition, and after comparing your best merges with the previous ones it should be clear which bits to change for the next attempt.
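
For what it's worth, the step-7 tweak (swap a 0 and a 1 inside the IN blocks, then make a matching swap inside the OUT blocks) can be written as a small helper. Which positions to swap is the judgment call, so they are left as arguments; the example reproduces the change shown just above.

```python
def tweak(weights: str, in_a: int, in_b: int, out_a: int, out_b: int) -> str:
    vals = [v.strip() for v in weights.split(",")]
    assert len(vals) == 26
    # IN00-IN11 live in slots 1-12, OUT00-OUT11 in slots 14-25.
    vals[1 + in_a], vals[1 + in_b] = vals[1 + in_b], vals[1 + in_a]
    vals[14 + out_a], vals[14 + out_b] = vals[14 + out_b], vals[14 + out_a]
    return ",".join(vals)

base = "0,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0"
print(tweak(base, 2, 8, 3, 9))
# -> 0,1,1,0,1,1,1,0,0,1,0,0,0,1,1,1,0,1,1,1,0,0,1,0,0,0
```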

8.- And now you could ask, "why don't you just tell me what is going to change?" The thing is, I don't know; it depends, and most importantly, a change you make may cause the output to be the complete opposite of what you intended. Somehow. When that happens I realize I have no idea what I'm doing, but usually swapping Model 1 and Model 2's spots, or blocks A and C, or both, gets me back on track and produces something better than all previous attempts. What you need to know is that a string of 0s and 1s exists that produces what you want, and this is a method to the madness: all we're doing is abusing binary search to minimize the space you need to explore, so instead of making 128 merges and building a grid, you do it in 32 cuts, because from the first 8 we already know which ones you don't need to make, as 3/4 of the merges would have the wrong style or composition.

9.- And now you could ask, "why do you always keep the same numbers of 1s and 0s?" That's because this is a 50% merge of both models; it gives them both a fair chance to deliver what you wanted. Once you have one you're happy with, you could just swap a single 0 to a 1 so it's closer to one of the models and see what happens, set both base and M00 to 0s or 1s, maybe even start using 0.5 to see what the in-between places look like, go bonkers about it! But first you need to find the half-way point you're happy with.

I've never gotten this far, because at some point in the tweaking you find a surprise... a style better than the one you were going for, a new composition that improves on the one from the best model, a grand slam out of nowhere! And you're done. You didn't get what you wanted, but you can point at your merge and say "yes, this is better than both models", and you share it.

Hi! I'm also sorry about the bump, but I've been researching SuperMerger and MBW lately. I actually bothered to read the GitHub readme in its entirety, which took an astounding 15 minutes - so, I thought I'd clear up some misinformation I've seen passed around, just to help out anyone else Googling around for info! Funnily enough, some people can merge hundreds of models without a logical understanding of the technology they've used!

  1. It is impossible to do a "Weight Sum" trainDifference merge. The calcmode trainDifference is ONLY available in "Add Difference" mode. Source: https://github.com/hako-mikan/sd-webui-supermerger/blob/main/calcmode_en.md#train

  2. When MBW is enabled with the "Add Difference" or "Weight Sum" modes, it's not possible to set "alpha 1." Instead, the BASE value in MBW will be used as the "alpha" multiplier. Source: https://github.com/hako-mikan/sd-webui-supermerger#merge-mode
    2.1. Following that logic, in theory, setting BASE to 0 with these modes will cause model B to NOT be merged at all. Instead, you will get only model A with block weighting applied to its UNET.

  3. Only IN, MID, OUT blocks are part of the UNET. The BASE block is the text encoder. Source: https://github.com/hako-mikan/sd-webui-supermerger#merge-block-weight

  4. BASE does not need to have an opposing value to the M00 block. In fact, they have absolutely no correlation, so setting them to the same value has no adverse effect! Source: See # 2 and # 3 above.

  5. Intermediate values between 0 and 1, although difficult to work with, can result in smooth merges. Source: Built-in presets on the extension and some popular merges which posted their MBW recipes.

  6. Setting BASE to 0 repeatedly would most likely "cause the output to be the complete opposite of what you intended" and also cause the model merger to realize they "have no idea what they're doing." Source: # 2.1 above.

Hope that helps everyone, and remember to always double-check sources when researching online! :)
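
To make points 1-3 concrete, here is a minimal per-tensor sketch of the formulas as the readme cited above describes them, with BASE standing in for alpha once MBW is enabled. It only shows the arithmetic; whether the extension actually behaves this way is exactly what the replies below dispute.

```python
def weight_sum(a, b, alpha):
    # "Weight sum" mode: (1 - alpha) * A + alpha * B
    return (1 - alpha) * a + alpha * b

def add_difference(a, b, c, alpha):
    # "Add difference" mode: A + alpha * (B - C)
    return a + alpha * (b - c)

# Point 2.1's inference: if BASE is the alpha in these formulas, then alpha = 0
# returns model A unchanged for whatever BASE applies to.
assert weight_sum(10.0, 99.0, alpha=0.0) == 10.0
assert add_difference(10.0, 99.0, 5.0, alpha=0.0) == 10.0
```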

It is impossible to do a "Weight Sum" trainDifference merge. The calcmode trainDifference is ONLY available in "Add Difference" mode. Source

Your source is wrong; read the logs of what is happening. Whenever it's used, Model A is set as Model C, and the calculations are carried out as if you had used Add Difference with Model A as Model C.

Following that logic, in theory, setting BASE to 0 with these modes will cause model B to NOT be merged at all.

Your source is wrong. Whatever you do, Supermerger will act as if the text encoder were part of the UNET; if you pretended that OUT11 was the BASE block and tested your merges, you'd get the same behavior, because the text encoder behaves like a UNET block.

BASE does not need to have an opposing value to the M00 block

I was giving advice about how to produce the best-quality models, and ALL the models I have provided to the community that used merge block weights have used this opposition. If you don't, then your model is likely to look too much like Model A or like Model B, with the same style and composition; opposing them is the way to get something new and unique, and not yet another merge that looks like every other merge out there. You can check my recently released models Protogen Unofficial and Prodigy; they have novel compositions you won't find in other models, and that requires opposition.

Intermediate values between 0 and 1, although difficult to work with, can result in smooth merges.

Your source is wrong; what you call smooth is a lack of detail. People have continued to merge models to the point that characters appear on empty backgrounds, and people love them because of the NAME of the models. I have tested it: if I include a popular model in my merge and in the model's name, it may get 60,000 downloads here on Hugging Face, while another model that merges unknowns, clearly outperforms it on any prompt, and has an original name may not get to 3,000 downloads. Try it: merge DreamShaper 8 and EpicPhotoGasm, call it EpicDreamShaperPhotoGasm, and see how people will love it and praise it and download it more than any one I made, and it won't matter whether it's smooth or not, because the popularity of models is not an indication of quality.

The advantage of merging hundreds of models is that you can point out when the documentation is wrong, and when undocumented features are clearly the best way to get where you want to go. No offence @Heather95, but as we'd say on 4chan, you sound like someone giving dating advice because you read a book about seduction and can use it as your source, while my advice is different because I've been with actual partners.

Don't point to the text of a GitHub page; show your work, I'd love to see it. I can point to the thousands of monthly downloads of the models I've merged and uploaded to Hugging Face, in case people want to achieve that kind of success with model merging by following my advice.

Your source is wrong

Congratulations! I'm impressed. The "wrong source," as you call it, is the creator of SuperMerger explaining how it works. I cited the literal GitHub readme, entirely written by the creator of SuperMerger.

I don't have to spend hours effectively plagiarizing whoever created Model A with the base at 0. Granted, the coder of SuperMerger could have made a mistake in the readme, which means it might be possible to use MBW base 0 with WeightSum or AddDifference and still get model B in the merge. But I seriously doubt that, and I'm not going to bother. Make your own X/Y comparisons with a "merge" and Model A or B and figure it out.

Other than both the creator of SuperMerger saying the following is impossible and it literally being impossible in the extension itself, how would a trainDIFFERENCE merge work as a weighted sum? "sum" means adding things together, while "difference" means subtracting things from each other.

Edit:
Okay so, to be genuine here, I'm not going to reply to you any more for both of our sakes, but I thank you for spending the time to type out a response, genuinely. Someone could tell you, "Read the manual!" And you'd say, "But the manual is wrong! I don't believe it! I'M right!"

Test it yourself if you love merges so much.

Same to you. If you did the testing, you would know I've accurately depicted what happens when you try it, regardless of what the creator of the tool says; you never considered the possibility that the creator of Supermerger doesn't know how their own tool works.
I'm sorry about your loss. At the end of the day there are much more important things in life than who is right in a discussion.

the thing is, I don't know, it depends, and most importantly, a change you make may cause the output to be the complete opposite of what you intended. Somehow. When that happens I realize I have no idea what I'm doing

interesting.
