Gemma-4-31B-it-MLX-6bit

MLX (Apple Silicon) conversion of google/gemma-4-31B-it, quantized to 6-bit · high quality.

Text-only build of the Gemma-4 31B backbone (the multimodal vision components are not included).

Quantizations

Part of the Gemma-4-31B-it MLX collection.

Variant Notes
8-bit 8-bit · near-lossless
6-bit (this repo) 6-bit · high quality
5-bit 5-bit
4-bit 4-bit · balanced default

Use with mlx-lm

pip install mlx-lm
python -m mlx_lm generate --model pipenetwork/Gemma-4-31B-it-MLX-6bit --prompt "Explain attention in transformers." -m 256

Validation

Smoke-tested locally: loads and generates coherent text.

License

Apache 2.0 (inherited from the base model). Quantization config: {"group_size": 64, "bits": 6, "mode": "affine"}.

Downloads last month
34
Safetensors
Model size
31B params
Tensor type
BF16
·
U32
·
MLX
Hardware compatibility
Log In to add your hardware

6-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for pipenetwork/Gemma-4-31B-it-MLX-6bit

Quantized
(238)
this model

Collection including pipenetwork/Gemma-4-31B-it-MLX-6bit