
A test quantization of OpenHermes-2.5-Mistral-7B by teknium, using importance matrices computed on Ukrainian text. The hope is to reduce the coherence hit that quantization causes in Ukrainian, at the cost of some performance in other languages.

The importance matrix was computed in roughly 20 minutes on a Ryzen 5 3550H and a GTX 1650 with 8 layers offloaded, using a context size of 512.
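
For reference, the computation looks roughly like this with llama.cpp's imatrix tool. This is a sketch, not the exact command used; the file names are placeholders.

```bash
# Sketch: compute an importance matrix with llama.cpp's imatrix tool.
# File names are placeholders. -ngl 8 offloads 8 layers to the GPU and
# -c 512 matches the context size mentioned above.
./imatrix \
  -m openhermes-2.5-mistral-7b.f16.gguf \
  -f calibration-uk.txt \
  -o imatrix.dat \
  -c 512 \
  -ngl 8
```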

The calibration data is just a mix of my personal GPT chats, random words, and random Wikipedia articles, totaling about 15k tokens. That's definitely not optimal, but both the data and the computed imatrix are in the repo for anyone to tinker with, as shown below.
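
If you want to re-quantize with the provided imatrix, something along these lines should work. Again a sketch: the file names and the IQ4_XS quant type are just illustrative.

```bash
# Sketch: quantize using the precomputed importance matrix from the repo.
# The quant type (IQ4_XS here) and file names are placeholders.
./quantize \
  --imatrix imatrix.dat \
  openhermes-2.5-mistral-7b.f16.gguf \
  openhermes-2.5-mistral-7b.IQ4_XS.gguf \
  IQ4_XS
```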

Will be updated with perplexity testing later, probably? 😭 I haven't done proper tests quite yet; it feels better than the old quants when chatting in Ukrainian, so hopefully I'll get around to actually benchmarking it somehow.
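
In the meantime, anyone curious can run the numbers with llama.cpp's perplexity tool. A sketch with placeholder file names; a Ukrainian-language test file is the interesting case here.

```bash
# Sketch: measure perplexity of a quant on a held-out text file.
# test-uk.txt is a placeholder for a Ukrainian test set.
./perplexity \
  -m openhermes-2.5-mistral-7b.IQ4_XS.gguf \
  -f test-uk.txt \
  -c 512
```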
