Thanks

by Ixel1 - opened

Just wanting to say thanks for quantising this so that I can fit it on an RTX 3090 GPU. Of all the models I've tried so far, this one performs best for my use case (analysing whether a user message contains certain words, or variations of them intended to evade detection). It works nicely.

Ixel1 changed discussion status to closed

@Ixel1 , happy to hear, though I should mention that this model is not the very latest. It was quantized with the wikitext parquet file many people use for calibration. I might do the 70B 1.2b (2.55bpw and 2.3bpw) later; recent changes broke quantization/calibration on Windows, so I might need to switch OS if I want to do more :)
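For anyone curious what that calibration step looks like: the bpw figures suggest an EXL2-style quantization, so below is a minimal sketch of how a conversion with a wikitext calibration parquet is typically driven through exllamav2's convert.py. The paths, the 2.55bpw target, and the flag spellings are illustrative assumptions, not the exact settings used for this repo; check `python convert.py --help` in your exllamav2 checkout before running.

```python
# Minimal sketch (assumption: EXL2 quantization via exllamav2's convert.py).
# All paths and the bpw target are placeholders, not the settings used for this quant.
import subprocess

cmd = [
    "python", "convert.py",
    "-i", "/models/base-70b-fp16",        # hypothetical: source model in HF format
    "-o", "/tmp/exl2-work",               # hypothetical: scratch/working directory
    "-cf", "/models/base-70b-2.55bpw",    # hypothetical: output directory for the quant
    "-c", "wikitext-test.parquet",        # the wikitext calibration parquet mentioned above
    "-b", "2.55",                         # target bits per weight
]

# Runs the conversion and raises if it fails.
subprocess.run(cmd, check=True)
```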
