Gemma 4 E4B Instruct — 6-bit MLX

6-bit MLX quantization of google/gemma-4-E4B-it, for Apple Silicon (~6.6 GB). Vision-language model — run it with mlx-vlm, not mlx-lm.

Usage

pip install -U mlx-vlm
python -m mlx_vlm.generate \
  --model TyKaoz/gemma-4-E4B-it-6bit \
  --prompt "Explain quantization in one sentence." \
  --max-tokens 200
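
To script the same command from Python, one option is to build the argument list and hand it to subprocess. The sketch below is illustrative only: the helper names build_generate_command and run_generate are hypothetical, and the optional --image flag is assumed to be available in the installed mlx-vlm version (check python -m mlx_vlm.generate --help). The --model, --prompt, and --max-tokens flags are the ones shown above.

```python
import shlex
import subprocess
import sys
from typing import List, Optional

MODEL_ID = "TyKaoz/gemma-4-E4B-it-6bit"


def build_generate_command(
    prompt: str,
    max_tokens: int = 200,
    image: Optional[str] = None,
) -> List[str]:
    """Build the argv list for `python -m mlx_vlm.generate` (hypothetical helper)."""
    cmd = [
        sys.executable, "-m", "mlx_vlm.generate",
        "--model", MODEL_ID,
        "--prompt", prompt,
        "--max-tokens", str(max_tokens),
    ]
    if image is not None:
        # Assumption: the installed mlx-vlm CLI accepts --image <path or URL>.
        cmd += ["--image", image]
    return cmd


def run_generate(prompt: str, max_tokens: int = 200, image: Optional[str] = None) -> str:
    """Run the command and return its stdout; needs Apple Silicon and mlx-vlm."""
    cmd = build_generate_command(prompt, max_tokens, image)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


# Print the command only; nothing is downloaded or executed here.
print(shlex.join(build_generate_command("Explain quantization in one sentence.")))
```

Calling run_generate(...) on a Mac with mlx-vlm installed would download the weights on first use and return the CLI output as a string. Passing the arguments as a list avoids shell-quoting problems with prompts that contain quotes.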

Base                   Tool     Precision         Size
google/gemma-4-E4B-it  mlx-vlm  6-bit · group 64  ~6.6 GB
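
As a rough guide to what "6-bit · group 64" means for storage, the sketch below estimates the effective bits per weight. It assumes MLX-style affine quantization in which every group of 64 weights also stores one 16-bit scale and one 16-bit bias; that overhead figure is an assumption, not something stated on this card. The function names and the one-billion-weight example are hypothetical and do not describe this model's parameter count.

```python
def effective_bits_per_weight(
    bits: int = 6,
    group_size: int = 64,
    scale_bits: int = 16,  # assumption: one 16-bit scale per group
    bias_bits: int = 16,   # assumption: one 16-bit bias per group
) -> float:
    """Payload bits plus the per-group scale/bias overhead, per weight."""
    return bits + (scale_bits + bias_bits) / group_size


def quantized_size_gb(num_weights: int, bits_per_weight: float) -> float:
    """Storage in GB (10^9 bytes) for num_weights quantized weights."""
    return num_weights * bits_per_weight / 8 / 1e9


bpw = effective_bits_per_weight()  # 6 + 32/64 = 6.5
print(f"effective bits per weight: {bpw}")

# Hypothetical example: one billion quantized weights.
print(f"size for 1e9 weights: {quantized_size_gb(1_000_000_000, bpw)} GB")
```

Under these assumptions a quantized tensor takes 6.5 bits per weight rather than 6. The published ~6.6 GB total cannot be derived from this alone, since a checkpoint may keep some tensors unquantized.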

By TyKaoz, a privacy-first native macOS LLM chat client. License: Apache 2.0, inherited from the base model.
