Model Card for Mistral-Large-Instruct-2411-MLX
This is a 2bit quantization of the Mistral Large Instruct 2411 model for MLX (Apple silicon). It was created using the mlx-lm library with the following CLI command:
mlx_lm.convert
--hf-path /path/to/your/fp16/model
-q
--q-bits 2
--q-group-size 32
Quantized Versions
Each version is optimized for specific memory and performance trade-offs.
Original Model
The original Mistral-Large-Instruct-2411 model is available here. Mistral model usage is governed by the Mistral Research License.
License
This model family is governed by the Mistral Research License. Please review the license terms before use.
Table of Contents
Model Details
Model Description
The Mistral-Large-Instruct-2411-MLX family includes quantized versions of the Mistral Large Instruct 2411 model, optimized for deployment on MLX (Apple Silicon). The quantization reduces memory usage and inference latency, enabling efficient deployment on resource-constrained systems.
- Developed by: Mistral AI
- Model type: Large language model
- Language(s): English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Russian, Korean
- Quantization levels: 2-bit (Q2), 4-bit (Q4)
Technical Specifications
- Parent Model: Mistral-Large-Instruct-2411
- Quantization: 2-bit (Q2), 4-bit (Q4)
- Framework: MLX (
mlx-lm
library)
How to Get Started
Visit the individual quantized repositories for details and usage instructions:
Model Card Contact
For inquiries, contact Zach Landes.
- Downloads last month
- 12
Model tree for zachlandes/Mistral-Large-Instruct-2411-Q2-MLX
Base model
mistralai/Mistral-Large-Instruct-2411