Can you please share how to make this version?

#3 opened by cloudyu

Thanks!

MLX Community org

Yeah, you can simply use mlx-lm to do that. You can upload new models to Hugging Face by specifying --upload-repo for conversion. For instance, if you want to upload a quantized Mistral-7B model to the MLX Hugging Face community, you can run:

python -m mlx_lm.convert \
    --hf-path mistralai/Mistral-7B-v0.1 \
    -q \
    --upload-repo mlx-community/my-4bit-mistral
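
Once the upload finishes, it's worth sanity-checking the quantized weights straight from the Hub. Here is a minimal sketch, assuming the repo name from the command above and the load/generate helpers from the mlx_lm Python package (-q quantizes to 4 bits by default):

from mlx_lm import load, generate

# Pull the converted weights from the Hub and load them
model, tokenizer = load("mlx-community/my-4bit-mistral")

# Run a short generation to confirm the quantized model produces text
response = generate(model, tokenizer, prompt="Hello", verbose=True)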

Indeed, I did it the same way for my MoE model cloudyu/Mixtral_34Bx2_MoE_60B: https://huggingface.co/cloudyu/Mixtral_34Bx2_MoE_60B

But the MLX version doesn't work, and I don't know why.

It reports:

mlx-examples/llms/mlx_lm/models/mixtral.py", line 137, in __call__
    mx.argpartition(-gates, kth=ne, axis=-1)[:, :ne]
ValueError: [argpartition] Received invalid kth 2 along axis -1 for array with shape: (1,2)
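
For what it's worth, the shape (1, 2) in that error suggests the router only sees two experts while num_experts_per_tok (the ne in the traceback) is also 2, and mx.argpartition appears to require kth to be strictly less than the size of the partitioned axis. A minimal sketch that reproduces the failure under that assumption, plus a kth=ne - 1 variant that still selects the top-ne experts:

import mlx.core as mx

gates = mx.zeros((1, 2))  # router scores: 1 token, 2 experts (hypothetical values)
ne = 2                    # num_experts_per_tok from the model config

# This mirrors the failing line: kth=2 is out of range on an axis of size 2
# inds = mx.argpartition(-gates, kth=ne, axis=-1)[:, :ne]  # raises ValueError

# Partitioning at kth=ne - 1 keeps the ne largest gates in the first ne slots
inds = mx.argpartition(-gates, kth=ne - 1, axis=-1)[:, :ne]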
