BlueNipples
/

DaringLotus-SnowLotus-10.7b-IQ-GGUF

	---
	license: apache-2.0
	tags:
	- Solar
	- Mistral
	- Roleplay
	---
	![SnowLotus Logo](https://cdn-uploads.huggingface.co/production/uploads/64bb1109aaccfd28b023bcec/gTQtPK46laLIFg0RTAv73.png)

	## Summary

	3x Importance Matrix GGUFs and 2x regular GGUFs for https://huggingface.co/BlueNipples/SnowLotus-v2-10.7B and https://huggingface.co/BlueNipples/DaringLotus-v2-10.7b.

	I'm super happy with these merges; they turned out great. Daring is the slightly more creative, prose-oriented one, but it is also slightly less coherent. Both have excellent prose for their size that is largely not very GPT-ish, and they can often take story context, lore entries and character card info into account. You can probably use these as your mainstay, which is especially helpful if your GPU struggles with 13b, and honestly I think these models are _probably_ equal to or better than any 13b anyway. I might be wrong, but I do think they are very good compared to anything I've personally run. See the individual model cards for merge recipe details.

	Thanks to lucyknada for helping me get the imatrix quants done quicker!

	## Importance Matrix Note

	Imatrix quants currently do not run with Koboldcpp, although support is bound to arrive in the future since llama.cpp already supports them (and, I'm guessing, therefore ooba as well). These quants should improve perplexity, especially at the smaller quant sizes. The .dat files are also included, so if you make an fp16 GGUF from the models on the main model cards, you might be able to save yourself some time producing your own imatrix quants.
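If you want to reuse the .dat files, a minimal sketch of the quantize step might look like the following. It only builds the command line for llama.cpp's quantize tool, which accepts an `--imatrix` option. The binary path, the file names and the helper `build_quantize_command` are all assumptions for illustration (older llama.cpp builds call the binary `quantize`, newer ones `llama-quantize`), so adjust them to your setup.

```python
from pathlib import Path


def build_quantize_command(
    fp16_gguf: str,
    imatrix_dat: str,
    quant_type: str,
    quantize_bin: str = "./llama-quantize",  # assumed path; older builds use ./quantize
) -> list[str]:
    """Return the argv list for an imatrix-guided llama.cpp quantization run."""
    source = Path(fp16_gguf)
    # Name the output after the source file and the target quant type.
    output = source.with_name(f"{source.stem}-{quant_type}.gguf")
    return [
        quantize_bin,
        "--imatrix", imatrix_dat,  # importance matrix data (.dat file)
        str(source),               # fp16 GGUF to quantize
        str(output),               # quantized GGUF to write
        quant_type,                # e.g. "Q4_K_M"
    ]


if __name__ == "__main__":
    # Hypothetical file names; substitute your own fp16 GGUF and .dat file.
    cmd = build_quantize_command("snowlotus-f16.gguf", "snowlotus.imatrix.dat", "Q4_K_M")
    print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.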

	### Format Notes

	Solar is designed for 4k context, but Nyx reports that his merge works to 8k. Given that this has a slerp gradient back into that, I'm not sure which applies here. Use Alpaca instruct formatting.
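For reference, here is a rough sketch of building a prompt in the standard Alpaca instruct layout. The template text follows the original Stanford Alpaca format; the function name `alpaca_prompt` and its parameters are made up for this example, and frontends often ship their own Alpaca preset that may differ slightly.

```python
def alpaca_prompt(instruction: str, context: str = "") -> str:
    """Format an instruction (and optional input) in the Alpaca instruct layout."""
    if context:
        # Variant with an "Input" section carrying extra context.
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{context}\n\n"
            "### Response:\n"
        )
    # Variant without an "Input" section.
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )


if __name__ == "__main__":
    print(alpaca_prompt("Continue the story from the character's point of view."))
```

The model's reply is whatever it generates after the trailing `### Response:` line.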