Code for producing 4bit model

#5
by Federic - opened
MLX Community org

Hi, could you provide the code you used to quantize this model? I am particularly interested in the 'model.safetensors.index.json' file, because when I run quantization this file doesn't appear and I get some errors. Thanks

MLX Community org

Hi @Federic

I used the code described in the mlx-lm repo:

https://github.com/ml-explore/mlx-examples/tree/main/llms

python -m mlx_lm.convert \
    --hf-path mistralai/Mistral-7B-v0.1 \
    -q \
    --upload-repo mlx-community/my-4bit-mistral
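For context, 'model.safetensors.index.json' is just a JSON map from each tensor name to the shard file that stores it, plus a total-size entry; it is typically only written when the weights are split across multiple shard files. A minimal sketch of building one by hand (the tensor names, sizes, and shard assignments here are hypothetical, purely for illustration):

```python
import json

# Hypothetical tensor -> (shard file, size in bytes) assignment; a real
# converter derives this from the actual weights and a max shard size.
tensors = {
    "model.embed_tokens.weight": ("model-00001-of-00002.safetensors", 262144000),
    "model.layers.0.self_attn.q_proj.weight": ("model-00001-of-00002.safetensors", 33554432),
    "lm_head.weight": ("model-00002-of-00002.safetensors", 262144000),
}

index = {
    # total_size is the sum of all tensor sizes across every shard
    "metadata": {"total_size": sum(size for _, size in tensors.values())},
    # weight_map points each tensor name at the shard file holding it
    "weight_map": {name: shard for name, (shard, _) in tensors.items()},
}

with open("model.safetensors.index.json", "w") as f:
    json.dump(index, f, indent=2)
```

If the converted model fits in a single .safetensors file, there is nothing to index, which is one benign reason the file may not appear.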

Could you share those errors?
Additionally, why are you trying to requantize an existing model?

MLX Community org

My main problem is that I can't load this model using the Hugging Face Transformers API.

MLX Community org

Got it. The model weights in this organisation (MLX) are exclusively for Apple silicon.

If you want the HF Transformers weights, you can check these repos:

Hope this helps!
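One quick way to tell the two formats apart before loading: mlx-lm records its quantization settings under a "quantization" key in the repo's config.json, which plain HF Transformers checkpoints don't carry. A minimal sketch of that check (the sample config contents below are hypothetical, and the key-name convention is an assumption about mlx-lm's output):

```python
import json

def is_mlx_quantized(config: dict) -> bool:
    """Heuristic: assume mlx-lm stores its settings under a "quantization"
    key in config.json, which vanilla HF checkpoints lack."""
    return "quantization" in config

# Hypothetical config.json contents for a 4-bit MLX conversion
# versus a plain HF Transformers checkpoint.
mlx_config = {
    "model_type": "mistral",
    "quantization": {"group_size": 64, "bits": 4},
}
hf_config = {"model_type": "mistral"}

print(is_mlx_quantized(mlx_config))  # an MLX-quantized repo
print(is_mlx_quantized(hf_config))   # a plain Transformers repo
```

If the check is true, load the repo with mlx_lm rather than transformers.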

prince-canuma changed discussion status to closed
