In the Stable Diffusion space, model merging seems to be super common. There are two techniques that I've seen:
- Weighted average (given A and B are two models with identical shapes):
  `output = (1 - α)A + αB`
- Added diff (given A and B are models finetuned from the same base C):
  `output = A + α(B - C)`
The intuition I've gathered is that weighted average is akin to mixing the datasets, while added diff is more like additional finetuning. I'm not sure if there's any research out there that does a proper analysis, but this is what I've heard and understood. Obviously both methods raise eyebrows, but it's cool that they can sometimes work.
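To make the two formulas concrete, here's a minimal sketch of both merges. The toy "state dicts" below map parameter names to plain floats for readability; the real merge scripts do the same per-key arithmetic on the tensors in each checkpoint's state dict, and the names `weighted_average` / `add_diff` are just mine for illustration.

```python
def weighted_average(a, b, alpha):
    # output = (1 - alpha) * A + alpha * B, applied key by key
    return {k: (1 - alpha) * a[k] + alpha * b[k] for k in a}

def add_diff(a, b, c, alpha):
    # output = A + alpha * (B - C): graft B's finetuning delta onto A
    return {k: a[k] + alpha * (b[k] - c[k]) for k in a}

# Toy checkpoints: A is the target, B is a finetune of base C.
A = {"w": 1.0}
B = {"w": 3.0}
C = {"w": 0.0}

print(weighted_average(A, B, 0.5)["w"])  # (1 - 0.5)*1.0 + 0.5*3.0 = 2.0
print(add_diff(A, B, C, 0.5)["w"])       # 1.0 + 0.5*(3.0 - 0.0) = 2.5
```

Note that at α = 1, added diff applies B's full delta on top of A, whereas weighted average would just give you B back; that's why the second formula behaves more like "additional finetuning" than dataset mixing.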
The second technique has, with mixed results, enabled the transfer of inpainting capabilities to other models without catastrophic forgetting.
I'm assuming that ppo_hh_gpt-j was finetuned from GPT-J. Since the two models you've averaged are finetuned off of the same baseline, have you considered the second formula above?
Let me know what you think, or if there's any research you know of that I could study up on.
I've been curious as well if difference merge is possible with language models. Frankly I'm floored weight merging is possible in this space. I'm excited to find out what else is possible. I may eyeball Automatic1111's difference merge script and see what might be possible to retrofit into the merge script I used to make this model.
Caveat: I'm new to Python, capable of augmenting or heavily modifying the work of others, but at the moment I'm babby. I'm not the original author of the weight-average script, although I have converted it to a Colab environment and plan to officially release it in a week or two. The KoboldAI Discord has the original .py under the model merging channel.
The merge script I used is the work of Concedo: https://huggingface.co/concedo
Thanks for the info. I'll work on modifying that merge.py script to support difference merge. Here's hoping it works.
For anyone who comes across this discussion: