Merge 100% of the models instead of only parts

#1 by rombodawg

Is it possible to merge 100% of the models instead of only a % of each? I just don't see why you would cut off data from either model when they were each trained on different coding datasets. Why not merge the entire models together? Obviously I mean the adapter models of WizardCoder and Phind, not the base CodeLlama-34B.

Both models are used. It's a weighted average, with weights that change as a function of layer position (see the sketch below the list):

  • gradient_values: [0.75] means the weights are merged with weight 0.75 for model1 and 0.25 for model2 throughout.
  • gradient_values: [0.75, 0.25] means the ratio starts at 0.75 and ends at 0.25 across that layer range.
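To make the interpolation concrete, here's a minimal Python sketch of how a per-layer weight schedule like that could be built and applied. The function names and the linear interpolation between the endpoints are my assumptions based on the description above, not the actual merge script used for this model:

```python
import numpy as np

def layer_weights(gradient_values, num_layers):
    # One value -> the same weight for model1 at every layer;
    # two values -> a linear ramp from the first to the second.
    # (Assumption: linear interpolation, as the "starts/ends" wording suggests.)
    if len(gradient_values) == 1:
        return np.full(num_layers, float(gradient_values[0]))
    return np.linspace(gradient_values[0], gradient_values[-1], num_layers)

def merge_tensor(t1, t2, w):
    # Weighted average: w goes to model1, (1 - w) to model2,
    # so no parameters are discarded from either model.
    return w * t1 + (1.0 - w) * t2

print(layer_weights([0.75], 4))        # [0.75 0.75 0.75 0.75]
print(layer_weights([0.75, 0.25], 4))  # [0.75 0.5833 0.4167 0.25]
```

So with gradient_values: [0.75, 0.25] over four layers, the first layer is merged roughly 75/25 and the last 25/75, with the layers in between interpolated; every parameter of both models still contributes everywhere.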

Oh, my bad. I thought 0.75 meant 75% of one model was used and 25% of the other when merging.

This is still the best coding AI model I have come across. Are there any plans to create a new version? I suppose maybe merging with a Llama 3 coding model or something else... I've run the coding test I perform on each AI, and this is the only one (outside of GPT-4, of course) that actually creates a working app to my specifications, doesn't repeat itself often, and explains details well. Solid work, especially considering that other "newer" models don't seem to be able to pass my coding test.
Does anyone else have any other coding AI model suggestions to try out (just for fun, mostly, at this point)? Obviously I'm going to continue to use this one for my coding assistance for the foreseeable future.
