Could you make an exl2 quant for the weighted/imatrix version?
#1 opened by mjh657
Just wondering.
What do you mean exactly? This is the exl2 version of the model :)
mradermacher/IceCaffeLatteRP-7b-i1-GGUF — this version. Sorry if this makes no sense, I don't know a whole lot about how these are made.
Gotcha. Yeah, so that GGUF is made using an imatrix, which measures the effect of quantization on the weights so the conversion can target the less sensitive ones more aggressively.
ExllamaV2 already uses a similar method of measuring the effect of quantization on the weights; the measurement.json you see in the main repo is the equivalent of the imatrix that llama.cpp creates.
Point is, this basically already is an "imatrix" quant, so you're good to go :)
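For anyone curious how the two measurement files come about, here's a rough sketch of the two workflows. This assumes current llama.cpp and ExllamaV2 command-line tools; the model paths and calibration file names are placeholders, not the ones used for this repo.

```shell
# llama.cpp: build an importance matrix from a calibration text,
# then pass it to the quantizer so sensitive weights are preserved better
./llama-imatrix -m model-f16.gguf -f calibration.txt -o imatrix.dat
./llama-quantize --imatrix imatrix.dat model-f16.gguf model-IQ4_XS.gguf IQ4_XS

# ExllamaV2: convert.py first runs a measurement pass over the model,
# saving per-layer quantization error as measurement.json (the imatrix analogue),
# then reuses that file for the actual quantization at a chosen bitrate
python convert.py -i ./model-hf -o ./work -om measurement.json
python convert.py -i ./model-hf -o ./work -m measurement.json -b 5.0 -cf ./model-exl2-5bpw
```

In both cases the measurement step is what makes the quant "weighted": the second pass spends its bit budget where the first pass found the model to be most sensitive.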