SnowLotus Logo

Important Note

The most recent version of llama.cpp has broken compatibility with historical GGUFs, so I am uploading a few requants to keep these two models working. These will be called v3 in the file naming even though they are the same models.

Summary

3-4x Importance Matrix GGUFs and 3-4x regular GGUFs for https://huggingface.co/BlueNipples/SnowLotus-v2-10.7B and https://huggingface.co/BlueNipples/DaringLotus-v2-10.7b.

I added a few more quants. I'm super happy with these merges; they turned out great. Basically, Daring is the slightly more creative/prose-oriented one, but also slightly less coherent - it basically necessitates regens/swipes. They both have excellent prose for their size that is largely not very GPT-ish, and they are often able to take story context, lore entries and character card info into account. You can probably use these as your mainstay, which is especially helpful if your GPU struggles with 13b models - and honestly I think these models are probably equal to or better than any 13b anyway. I might be wrong, but I do think they are very good compared to anything I've personally run. See the individual model cards for merge recipe details.

Thanks to lucyknada for helping me get the imatrix quants done quicker!

Importance Matrix Note

Imatrix quants currently do not run with Koboldcpp, although they are bound to be supported in the future since Llamacpp supports them (and I'm guessing therefore ooba does too). Those quants should provide a perplexity boost, especially for the smaller quant sizes. The .dat files are also included, so if you make an fp16 GGUF from the main model cards you might be able to save yourself some time producing your own imatrix quants.
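For anyone rolling their own, here is a rough sketch of that workflow using the llama.cpp command-line tools, driven from Python for convenience. All file names and the calibration text are placeholders; if you reuse the provided .dat file, skip step 1:

```python
# Rough sketch: producing an imatrix quant from an fp16 GGUF with the
# llama.cpp CLI tools (./imatrix and ./quantize). File names are placeholders.
import subprocess

FP16_GGUF = "SnowLotus-v2-10.7B-f16.gguf"  # built from the main model card
IMATRIX = "imatrix.dat"                    # or reuse the provided .dat and skip step 1
QUANT_TYPE = "IQ3_XXS"                     # target quant type

# Step 1 (optional): compute an importance matrix over some calibration text.
subprocess.run(
    ["./imatrix", "-m", FP16_GGUF, "-f", "calibration.txt", "-o", IMATRIX],
    check=True,
)

# Step 2: quantize the fp16 GGUF, weighted by the importance matrix.
subprocess.run(
    ["./quantize", "--imatrix", IMATRIX, FP16_GGUF,
     f"SnowLotus-v2-10.7B-{QUANT_TYPE}.gguf", QUANT_TYPE],
    check=True,
)
```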

Format Notes

Solar is designed for 4k context, but Nyx reports that his merge works to 8k. Given this merge has a SLERP gradient back into that, I'm not sure which applies here. These use Alpaca instruct formatting (a sketch of the template follows below).
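For reference, a minimal sketch of the standard Alpaca instruct template; frontend presets vary slightly in the preamble, so treat the exact wording as an assumption:

```python
# Minimal sketch of the standard Alpaca instruct template.
def alpaca_prompt(instruction: str) -> str:
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

print(alpaca_prompt("Write the opening scene of a snowy mountain journey."))
```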

Ayumi Index

http://ayumi.m8geil.de/erp4_chatlogs/?S=rma_0#!/index

In the Ayumi ERPv4 Chat Log Index, SnowLotus scores 94.10 in Flesch, which means it produces more complex sentences than Daring (quite complex); DaringLotus scores higher in Var and Adj/Adv, which means it makes heavier use of adjectives and adverbs (is more descriptive). Notably, Daring is in the top 8 for adjectives in a sentence, the highest in its weight class if you discount the Chinese model, and in general both models did very well on this metric (SnowLotus ranks higher here than anything above it in IQ4), showcasing their descriptive ability.

SnowLotus beats DaringLotus on IQ4 with a score of 70.94, only beaten by SOLAR Instruct and Fimbulvetr in its weight class (though notably also by Kunoichi 7b, by a slim margin); DaringLotus is a bit lower at 65.37 - not as smart.

Interestingly, the benchmarking here showed repetition for both models (which I haven't seen), but more so with SnowLotus - so it's possible Daring repeats less than SnowLotus. These results roughly confirm my impressions of the differences, although they potentially reveal some new details too. I've had a great experience RPing with these models and have seen no repetition myself, but be sure to use MinP or DynaTemp rather than the older samplers, and be prepared to regen anything they get stuck on!
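If you run the GGUFs through llama-cpp-python, a minimal sketch of a MinP-centred sampler setup looks like this (the model file name is a placeholder, and min_p support assumes a reasonably recent llama-cpp-python build):

```python
# Minimal sketch: MinP sampling with llama-cpp-python.
# Assumes a recent build with min_p support; the model file name is a placeholder.
from llama_cpp import Llama

llm = Llama(model_path="SnowLotus-v2-10.7B-Q4_K_M.gguf", n_ctx=4096)

prompt = "### Instruction:\nContinue the scene in vivid prose.\n\n### Response:\n"

out = llm(
    prompt,
    max_tokens=256,
    temperature=1.0,
    min_p=0.05,          # MinP does the filtering work...
    top_p=1.0,
    top_k=0,             # ...so the older samplers are effectively disabled
)
print(out["choices"][0]["text"])
```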
