BETTER THAN GOLIATH?!
I extracted a LoRA from Xwin, merged it into Euryale, and then merged the result with itself in a Goliath-style merge using mergekit. The resulting model performs better than Goliath on my tests (note: performance on tests is not necessarily performance in practice). Test it, have fun with it. This is a sister model of Premerge-XE-XE-123B.
Prompt format
Alpaca.
Ideas behind it
Ever since Goliath was created, I have wondered whether it was possible to make something even better. I tried linear, passthrough, SLERP and TIES merges, but I could not recreate the greatness of Goliath, at least not in a way that I liked in practical use. I knew that LoRAs existed, but I didn't know how well they performed. I created a model named Gembo by merging a shitton of LoRAs together, and surprisingly it worked! In fact, it worked so well that it was the best model on my benchmarks until now. When I found a tool named LORD, which can extract a LoRA from any model, I knew I could do something even better.
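To make the extract-then-merge idea concrete, here is a minimal sketch on toy matrices. It assumes that extraction works by taking a truncated SVD of the difference between a fine-tuned weight matrix and the weights of a shared base model, which is a common way to do it; I have not checked that LORD does exactly this. The function names `extract_lora` and `apply_lora`, the matrix sizes and the rank are all made up for illustration.

```python
import numpy as np

def extract_lora(w_finetuned, w_base, rank):
    """Approximate (w_finetuned - w_base) with a low-rank product b @ a.

    Assumption: extraction is a truncated SVD of the weight difference.
    """
    delta = w_finetuned - w_base
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    # Keep the top `rank` singular directions and split sqrt(s) over both factors.
    sqrt_s = np.sqrt(s[:rank])
    b = u[:, :rank] * sqrt_s           # shape (out_features, rank)
    a = sqrt_s[:, None] * vt[:rank]    # shape (rank, in_features)
    return b, a

def apply_lora(w_target, b, a, scale=1.0):
    """Merge the low-rank update into another model's weight matrix."""
    return w_target + scale * (b @ a)

# Toy stand-ins for a single weight matrix of each model (hypothetical sizes).
rng = np.random.default_rng(0)
w_base = rng.normal(size=(64, 48))
true_delta = rng.normal(size=(64, 4)) @ rng.normal(size=(4, 48))  # exactly rank 4
w_donor = w_base + true_delta            # "fine-tune" the LoRA is extracted from
w_recipient = rng.normal(size=(64, 48))  # the other model that receives the LoRA

b, a = extract_lora(w_donor, w_base, rank=4)
w_merged = apply_lora(w_recipient, b, a)
recon_error = float(np.abs(b @ a - true_delta).max())
```

In this toy case the donor's change is exactly rank 4, so a rank-4 extraction recovers it almost perfectly. On real models the difference is not exactly low rank, so the chosen rank trades LoRA size against how much of the fine-tune is captured.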
I extracted a LoRA from Euryale, then one from Xwin, and began testing. Merging the Euryale LoRA into Xwin, and the Xwin LoRA into Euryale, created better models that outperformed their parents:
Name | Quant | Size | B | C | D | S | P | total | BCD | SP |
---|---|---|---|---|---|---|---|---|---|---|
Sao10K/Euryale-1.3-L2-70B | Q6_K | 70B | 0 | 2 | 0 | 3 | 5 | 10 | 2 | 8 |
Sao10K/Euryale-1.3-L2-70B+xwin-lora | Q6_K | 70B | 2 | 2 | 1 | 5.5 | 5.5 | 16 | 5 | 11 |
Xwin-LM/Xwin-LM-70B-V0.1 | Q6_K | 70B | 0 | 1 | 2 | 5.5 | 5.25 | 13.75 | 3 | 10.75 |
Xwin-LM/Xwin-LM-70B-V0.1+euryale-lora | Q6_K | 70B | 3 | 2 | 2 | 6 | 5 | 18 | 7 | 11 |
The results seemed promising, so I continued testing, merging the two in a Goliath-like way in different orders (EX = Euryale + Xwin LoRA; XE = Xwin + Euryale LoRA). The results were even more surprising:
Name | Quant | Size | B | C | D | S | P | total | BCD | SP |
---|---|---|---|---|---|---|---|---|---|---|
alpindale/goliath-120b | Q6_K | 120B | 3 | 2 | 1 | 6 | 6 | 18 | 6 | 12 |
ChuckMcSneed/Premerge-EX-EX-123B(this model) | Q6_K | 123B | 2 | 2 | 1.5 | 7.25 | 6 | 18.75 | 5.5 | 13.25 |
ChuckMcSneed/Premerge-EX-XE-123B | Q6_K | 123B | 2 | 2 | 2 | 5.75 | 6 | 17.75 | 6 | 11.75 |
ChuckMcSneed/Premerge-XE-EX-123B | Q6_K | 123B | 2 | 2 | 2.5 | 6.75 | 5.5 | 18.75 | 6.5 | 12.25 |
ChuckMcSneed/Premerge-XE-XE-123B | Q6_K | 123B | 3 | 2 | 2.5 | 7.25 | 5.25 | 20 | 7.5 | 12.5 |
Sao10K/Euryale-1.3-L2-70B+xwin-lora | Q6_K | 70B | 2 | 2 | 1 | 5.5 | 5.5 | 16 | 5 | 11 |
Xwin-LM/Xwin-LM-70B-V0.1+euryale-lora | Q6_K | 70B | 3 | 2 | 2 | 6 | 5 | 18 | 7 | 11 |
Contrary to my expectations, merging two different models was suboptimal in this case. The self-merge of Euryale + Xwin LoRA (this model) beat all of the other merges on the SP tests (creative writing), making it the highest-scoring model on those tests that I've tested so far, while the self-merge of Xwin + Euryale LoRA had the highest score overall.
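For readers who have not seen one, a Goliath-style merge is usually understood as a passthrough merge that stacks overlapping layer ranges taken alternately from two models, so a self-merge simply uses the same model on both sides. The sketch below only builds such a slice layout. The helper `interleaved_slices` and the `window` and `stride` values are hypothetical and are not the settings used for this model; the layer count of 80 assumes a Llama-2-70B-sized model.

```python
def interleaved_slices(models, num_layers, window, stride):
    """Return (model, start, end) layer slices, alternating between models.

    Consecutive slices overlap by (window - stride) layers, which is what
    makes the stacked model deeper than either source model.
    """
    slices = []
    start = 0
    index = 0
    while start < num_layers:
        end = min(start + window, num_layers)
        slices.append((models[index % len(models)], start, end))
        if end == num_layers:
            break
        start += stride
        index += 1
    return slices

# Hypothetical self-merge: both "sides" are the same EX model.
layout = interleaved_slices(["EX", "EX"], num_layers=80, window=16, stride=8)
total_layers = sum(end - start for _, start, end in layout)

for model, start, end in layout:
    print(f"{model}: layers [{start}, {end})")
print("stacked layers:", total_layers)
```

Swapping the list for `["EX", "XE"]` or `["XE", "EX"]` gives the mixed-order variants; each tuple would then correspond to one slice entry in a mergekit passthrough config.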
What it means
In the future, we can potentially get better models through controlled merging of LoRAs.
Benchmarks
NeoEvalPlusN
Name | Quant | Size | B | C | D | S | P | total | BCD | SP |
---|---|---|---|---|---|---|---|---|---|---|
alpindale/goliath-120b | Q6_K | 120B | 3 | 2 | 1 | 6 | 6 | 18 | 6 | 12 |
ChuckMcSneed/Premerge-EX-EX-123B(this model) | Q6_K | 123B | 2 | 2 | 1.5 | 7.25 | 6 | 18.75 | 5.5 | 13.25 |
ChuckMcSneed/Premerge-EX-XE-123B | Q6_K | 123B | 2 | 2 | 2 | 5.75 | 6 | 17.75 | 6 | 11.75 |
ChuckMcSneed/Premerge-XE-EX-123B | Q6_K | 123B | 2 | 2 | 2.5 | 6.75 | 5.5 | 18.75 | 6.5 | 12.25 |
ChuckMcSneed/Premerge-XE-XE-123B | Q6_K | 123B | 3 | 2 | 2.5 | 7.25 | 5.25 | 20 | 7.5 | 12.5 |
Sao10K/Euryale-1.3-L2-70B | Q6_K | 70B | 0 | 2 | 0 | 3 | 5 | 10 | 2 | 8 |
Sao10K/Euryale-1.3-L2-70B+xwin-lora | Q6_K | 70B | 2 | 2 | 1 | 5.5 | 5.5 | 16 | 5 | 11 |
Xwin-LM/Xwin-LM-70B-V0.1 | Q6_K | 70B | 0 | 1 | 2 | 5.5 | 5.25 | 13.75 | 3 | 10.75 |
Xwin-LM/Xwin-LM-70B-V0.1+euryale-lora | Q6_K | 70B | 3 | 2 | 2 | 6 | 5 | 18 | 7 | 11 |
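The summary columns appear to be plain sums of the individual test columns: BCD = B + C + D, SP = S + P and total = BCD + SP. This is inferred from the rows above rather than stated anywhere, so treat it as an assumption. A small sketch that recomputes the columns for three of the rows and picks out the leaders (the helper name `aggregate` is made up):

```python
# (B, C, D, S, P) scores copied from the table above.
scores = {
    "alpindale/goliath-120b": (3, 2, 1, 6, 6),
    "ChuckMcSneed/Premerge-EX-EX-123B": (2, 2, 1.5, 7.25, 6),
    "ChuckMcSneed/Premerge-XE-XE-123B": (3, 2, 2.5, 7.25, 5.25),
}

def aggregate(b, c, d, s, p):
    """Recompute the summary columns, assuming they are simple sums."""
    bcd = b + c + d
    sp = s + p
    return {"total": bcd + sp, "BCD": bcd, "SP": sp}

summary = {name: aggregate(*row) for name, row in scores.items()}

best_overall = max(summary, key=lambda name: summary[name]["total"])
best_sp = max(summary, key=lambda name: summary[name]["SP"])

for name, cols in summary.items():
    print(name, cols)
print("highest total:", best_overall)
print("highest SP:", best_sp)
```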