Can you please share how to make this version?

#3 opened by cloudyu

Thanks!

MLX Community org

Yeah, you can simply use mlx-lm to do that. You can upload new models to Hugging Face by specifying --upload-repo for conversion. For instance, if you want to upload a quantized Mistral-7B model to the MLX Hugging Face community, you can run:

python -m mlx_lm.convert \
    --hf-path mistralai/Mistral-7B-v0.1 \
    -q \
    --upload-repo mlx-community/my-4bit-mistral
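
Once the upload finishes, it's worth sanity-checking the quantized weights straight from the Hub. Here is a minimal sketch, assuming the repo name from the command above and the load/generate helpers from the mlx_lm Python package (-q quantizes to 4 bits by default):

from mlx_lm import load, generate

# Pull the converted weights from the Hub and load them
model, tokenizer = load("mlx-community/my-4bit-mistral")

# Run a short generation to confirm the quantized model produces text
response = generate(model, tokenizer, prompt="Hello", verbose=True)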

Indeed, I did it the same way for my MoE model cloudyu/Mixtral_34Bx2_MoE_60B: https://huggingface.co/cloudyu/Mixtral_34Bx2_MoE_60B

But the MLX version doesn't work, and I don't know why.

It reports:

mlx-examples/llms/mlx_lm/models/mixtral.py", line 137, in __call__
    mx.argpartition(-gates, kth=ne, axis=-1)[:, :ne]
ValueError: [argpartition] Received invalid kth 2 along axis -1 for array with shape: (1,2)
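
For what it's worth, the shape (1, 2) in that error suggests the router only sees two experts while num_experts_per_tok (the ne in the traceback) is also 2, and mx.argpartition appears to require kth to be strictly less than the size of the partitioned axis. A minimal sketch that reproduces the failure under that assumption, plus a kth=ne - 1 variant that still selects the top-ne experts:

import mlx.core as mx

gates = mx.zeros((1, 2))  # router scores: 1 token, 2 experts (hypothetical values)
ne = 2                    # num_experts_per_tok from the model config

# This mirrors the failing line: kth=2 is out of range on an axis of size 2
# inds = mx.argpartition(-gates, kth=ne, axis=-1)[:, :ne]  # raises ValueError

# Partitioning at kth=ne - 1 keeps the ne largest gates in the first ne slots
inds = mx.argpartition(-gates, kth=ne - 1, axis=-1)[:, :ne]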
