Even with the prompt, it still feels like a fight.

#3
by boqsc - opened

The model can get annoying and still refuse, meaning it acts more like an average GPT model with occasional compliance.
This is especially noticeable in quants like q2_k.

Cognitive Computations org

Yeah. Lower quants are not great. Looking into ways to improve this.

ehartford changed discussion status to closed

In the current state of model quantization, a q2 quant is only good for large models (>30B). In the future, we might have models trained with quantization resilience in mind so that they could do better.
