
BETTER THAN GOLIATH?!

I merged the Euryale-lora that I made into Xwin, then merged the result with itself in a goliath-style merge using mergekit. The resulting model performs better than goliath on my tests (note: performance on tests is not necessarily performance in practice). Test it, have fun with it. This is a sister model of Premerge-EX-EX-123B.

Prompt format

Alpaca.
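A minimal sketch of building an Alpaca-style prompt in Python. It assumes the common no-input Alpaca template with its usual preamble line; the card only says "Alpaca", so the exact preamble and the helper name `build_alpaca_prompt` are assumptions for illustration.

```python
# Common no-input Alpaca template (assumed; the card does not spell it out).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Alpaca template (hypothetical helper)."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


if __name__ == "__main__":
    print(build_alpaca_prompt("Write a short story about a lighthouse keeper."))
```

The model's reply is then generated as the continuation after `### Response:`.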

Ideas behind it

Since the creation of Goliath I've been wondering whether it was possible to make something even better. I tried linear, passthrough, SLERP and TIES merges, but I could not recreate the greatness of goliath, at least not in a way that I liked in practical use. I knew that LoRAs existed, but I didn't know how well they performed. I created a model named Gembo by merging a shitton of LoRAs together, and surprisingly it worked! In fact it worked so well that it was the best model on my benchmarks until now. When I found a tool named LORD, which can extract a LoRA from any model, I knew I could do something even better.
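To make "extracting a LoRA from a model" concrete, here is a rough sketch of the general idea for a single weight matrix: take the difference between a fine-tuned weight and a base weight, keep its best low-rank approximation via SVD, and add that to another model's weight. This is not LORD's actual code. The function names, the choice of base model, the rank, and the toy matrix sizes are all assumptions for illustration.

```python
import numpy as np


def extract_lora(w_finetuned, w_base, rank):
    """Return (lora_b, lora_a) with lora_b @ lora_a ~= w_finetuned - w_base."""
    delta = w_finetuned - w_base
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    root = np.sqrt(s[:rank])             # split singular values across both factors
    lora_b = u[:, :rank] * root          # shape (out_features, rank)
    lora_a = root[:, None] * vt[:rank]   # shape (rank, in_features)
    return lora_b, lora_a


def apply_lora(w_target, lora_b, lora_a, scale=1.0):
    """Merge the low-rank update into a different model's weight matrix."""
    return w_target + scale * (lora_b @ lora_a)


# Toy stand-ins for one layer of a shared base, a donor fine-tune, and a target.
rng = np.random.default_rng(0)
w_base = rng.standard_normal((16, 12))
true_delta = rng.standard_normal((16, 4)) @ rng.standard_normal((4, 12))  # rank-4 change
w_donor = w_base + true_delta
w_target = rng.standard_normal((16, 12))

lora_b, lora_a = extract_lora(w_donor, w_base, rank=4)
w_merged = apply_lora(w_target, lora_b, lora_a)
```

In this toy case the donor's change is exactly rank 4, so the extracted factors reproduce it; for real fine-tunes the truncation is lossy, and how much is lost depends on the chosen rank.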

I extracted a LoRA from Euryale, then one from Xwin, and began testing. Merging Euryale-lora into Xwin, and the other way around, created better models that outperformed their parents:

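| Name | Quant | Size | B | C | D | S | P | total | BCD | SP |
|---|---|---|---|---|---|---|---|---|---|---|
| Sao10K/Euryale-1.3-L2-70B | Q6_K | 70B | 0 | 2 | 0 | 3 | 5 | 10 | 2 | 8 |
| Sao10K/Euryale-1.3-L2-70B+xwin-lora | Q6_K | 70B | 2 | 2 | 1 | 5.5 | 5.5 | 16 | 5 | 11 |
| Xwin-LM/Xwin-LM-70B-V0.1 | Q6_K | 70B | 0 | 1 | 2 | 5.5 | 5.25 | 13.75 | 3 | 10.75 |
| Xwin-LM/Xwin-LM-70B-V0.1+euryale-lora | Q6_K | 70B | 3 | 2 | 2 | 6 | 5 | 18 | 7 | 11 |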

The results seemed promising, so I continued testing, merging the models goliath-style in different orders (EX = Euryale + Xwin-lora; XE = Xwin + Euryale-lora). The results were even more surprising:

| Name | Quant | Size | B | C | D | S | P | total | BCD | SP |
|---|---|---|---|---|---|---|---|---|---|---|
| alpindale/goliath-120b | Q6_K | 120B | 3 | 2 | 1 | 6 | 6 | 18 | 6 | 12 |
| ChuckMcSneed/Premerge-EX-EX-123B | Q6_K | 123B | 2 | 2 | 1.5 | 7.25 | 6 | 18.75 | 5.5 | 13.25 |
| ChuckMcSneed/Premerge-EX-XE-123B | Q6_K | 123B | 2 | 2 | 2 | 5.75 | 6 | 17.75 | 6 | 11.75 |
| ChuckMcSneed/Premerge-XE-EX-123B | Q6_K | 123B | 2 | 2 | 2.5 | 6.75 | 5.5 | 18.75 | 6.5 | 12.25 |
| ChuckMcSneed/Premerge-XE-XE-123B (this model) | Q6_K | 123B | 3 | 2 | 2.5 | 7.25 | 5.25 | 20 | 7.5 | 12.5 |
| Sao10K/Euryale-1.3-L2-70B+xwin-lora | Q6_K | 70B | 2 | 2 | 1 | 5.5 | 5.5 | 16 | 5 | 11 |
| Xwin-LM/Xwin-LM-70B-V0.1+euryale-lora | Q6_K | 70B | 3 | 2 | 2 | 6 | 5 | 18 | 7 | 11 |

Contrary to my expectations, merging two different models was suboptimal in this case. The self-merge of Euryale + Xwin-lora (EX-EX) beat all of the other merges on the SP tests (creative writing), making it the highest-scoring model on those tests that I've tested so far, while the self-merge of Xwin + Euryale-lora (XE-XE, this model) had the highest score overall.
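For readers unfamiliar with goliath-style merges: the idea is to stack overlapping layer ranges taken alternately from two source models, and a self-merge passes the same model as both sources. The sketch below only builds such a slice plan. The helper name, the layer count, the window, and the stride are hypothetical illustration values, not the settings used for this model, and the output is a plain Python list rather than a mergekit config.

```python
def interleaved_slices(models, n_layers, window, stride):
    """Build (model, start, end) layer slices that alternate between sources.

    Each slice covers `window` layers and starts `stride` layers after the
    previous one, so consecutive slices overlap when stride < window.
    """
    if not 0 < stride <= window:
        raise ValueError("need 0 < stride <= window")
    plan, start, i = [], 0, 0
    while True:
        end = min(start + window, n_layers)
        plan.append((models[i % len(models)], start, end))
        if end == n_layers:
            return plan
        start += stride
        i += 1


if __name__ == "__main__":
    # Hypothetical numbers: an 80-layer model merged with itself (XE-XE style).
    plan = interleaved_slices(["XE", "XE"], n_layers=80, window=16, stride=8)
    for model, start, end in plan:
        print(f"{model}: layers [{start}, {end})")
    print("stacked layers:", sum(end - start for _, start, end in plan))
```

Passing two different names, for example `["EX", "XE"]`, gives the mixed orderings from the table, with the first name supplying the first slice.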

What it means

Potentially, in the future we can get better models through controlled merging of LoRAs.

Benchmarks

NeoEvalPlusN

My meme benchmark.

| Name | Quant | Size | B | C | D | S | P | total | BCD | SP |
|---|---|---|---|---|---|---|---|---|---|---|
| alpindale/goliath-120b | Q6_K | 120B | 3 | 2 | 1 | 6 | 6 | 18 | 6 | 12 |
| ChuckMcSneed/Premerge-EX-EX-123B | Q6_K | 123B | 2 | 2 | 1.5 | 7.25 | 6 | 18.75 | 5.5 | 13.25 |
| ChuckMcSneed/Premerge-EX-XE-123B | Q6_K | 123B | 2 | 2 | 2 | 5.75 | 6 | 17.75 | 6 | 11.75 |
| ChuckMcSneed/Premerge-XE-EX-123B | Q6_K | 123B | 2 | 2 | 2.5 | 6.75 | 5.5 | 18.75 | 6.5 | 12.25 |
| ChuckMcSneed/Premerge-XE-XE-123B (this model) | Q6_K | 123B | 3 | 2 | 2.5 | 7.25 | 5.25 | 20 | 7.5 | 12.5 |
| Sao10K/Euryale-1.3-L2-70B | Q6_K | 70B | 0 | 2 | 0 | 3 | 5 | 10 | 2 | 8 |
| Sao10K/Euryale-1.3-L2-70B+xwin-lora | Q6_K | 70B | 2 | 2 | 1 | 5.5 | 5.5 | 16 | 5 | 11 |
| Xwin-LM/Xwin-LM-70B-V0.1 | Q6_K | 70B | 0 | 1 | 2 | 5.5 | 5.25 | 13.75 | 3 | 10.75 |
| Xwin-LM/Xwin-LM-70B-V0.1+euryale-lora | Q6_K | 70B | 3 | 2 | 2 | 6 | 5 | 18 | 7 | 11 |
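The derived columns appear to be plain sums of the per-test scores: total = B + C + D + S + P, BCD = B + C + D, and SP = S + P. The card does not state this explicitly, so treat it as an inference from the numbers. The short check below re-adds every row of the table and picks out the top models by total and by SP.

```python
# (B, C, D, S, P, total, BCD, SP) copied from the table above.
ROWS = {
    "alpindale/goliath-120b": (3, 2, 1, 6, 6, 18, 6, 12),
    "ChuckMcSneed/Premerge-EX-EX-123B": (2, 2, 1.5, 7.25, 6, 18.75, 5.5, 13.25),
    "ChuckMcSneed/Premerge-EX-XE-123B": (2, 2, 2, 5.75, 6, 17.75, 6, 11.75),
    "ChuckMcSneed/Premerge-XE-EX-123B": (2, 2, 2.5, 6.75, 5.5, 18.75, 6.5, 12.25),
    "ChuckMcSneed/Premerge-XE-XE-123B": (3, 2, 2.5, 7.25, 5.25, 20, 7.5, 12.5),
    "Sao10K/Euryale-1.3-L2-70B": (0, 2, 0, 3, 5, 10, 2, 8),
    "Sao10K/Euryale-1.3-L2-70B+xwin-lora": (2, 2, 1, 5.5, 5.5, 16, 5, 11),
    "Xwin-LM/Xwin-LM-70B-V0.1": (0, 1, 2, 5.5, 5.25, 13.75, 3, 10.75),
    "Xwin-LM/Xwin-LM-70B-V0.1+euryale-lora": (3, 2, 2, 6, 5, 18, 7, 11),
}


def derived(b, c, d, s, p):
    """Recompute (total, BCD, SP) from the five per-test scores."""
    return (b + c + d + s + p, b + c + d, s + p)


# Every score is a multiple of 0.25, so the float sums compare exactly.
mismatches = [
    name for name, (b, c, d, s, p, total, bcd, sp) in ROWS.items()
    if derived(b, c, d, s, p) != (total, bcd, sp)
]

best_total = max(ROWS, key=lambda name: ROWS[name][5])
best_sp = max(ROWS, key=lambda name: ROWS[name][7])

if __name__ == "__main__":
    print("rows with inconsistent sums:", mismatches)
    print("highest total:", best_total)
    print("highest SP:", best_sp)
```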