Gemma 4 E4B Instruct — 6-bit MLX

A 6-bit MLX quantization of google/gemma-4-E4B-it for Apple Silicon (~6.6 GB). This is a vision-language model, so run it with mlx-vlm, not mlx-lm.

Usage

pip install -U mlx-vlm

python -m mlx_vlm.generate \
  --model TyKaoz/gemma-4-E4B-it-6bit \
  --prompt "Explain quantization in one sentence." \
  --max-tokens 200
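To call the same command from a script, a small wrapper can assemble the argument list. The following is a minimal sketch, not part of mlx-vlm: `build_generate_command` is a hypothetical helper, and the optional `--image` argument assumes the CLI accepts an image path or URL under that flag. The other flags are the ones shown in the command above.

```python
import shlex
import sys
from typing import List, Optional


def build_generate_command(
    prompt: str,
    model: str = "TyKaoz/gemma-4-E4B-it-6bit",
    max_tokens: int = 200,
    image: Optional[str] = None,
) -> List[str]:
    """Assemble the argv for `python -m mlx_vlm.generate` (hypothetical helper)."""
    cmd = [
        sys.executable, "-m", "mlx_vlm.generate",
        "--model", model,
        "--prompt", prompt,
        "--max-tokens", str(max_tokens),
    ]
    if image is not None:
        # Assumption: the CLI takes an image path or URL via --image.
        cmd += ["--image", image]
    return cmd


# Print the command instead of running it, so the sketch has no side effects.
cmd = build_generate_command("Explain quantization in one sentence.")
print(shlex.join(cmd))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would execute it on a machine where mlx-vlm is installed. Building a list rather than a shell string avoids quoting problems when the prompt contains spaces or quotes.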

Base model	Tool	Quantization	Size
`google/gemma-4-E4B-it`	`mlx-vlm`	6-bit, group size 64	~6.6 GB
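The "6-bit, group 64" setting does not cost exactly 6 bits per weight, because group-wise quantization also stores per-group parameters. The sketch below works through that arithmetic under one assumption: each group of 64 weights carries one 16-bit scale and one 16-bit bias. The function names and the one-billion-weight figure are illustrative only.

```python
def effective_bits_per_weight(
    bits: int,
    group_size: int,
    scale_bits: int = 16,  # assumption: one 16-bit scale per group
    bias_bits: int = 16,   # assumption: one 16-bit bias per group
) -> float:
    """Average storage cost per quantized weight, including group overhead."""
    return bits + (scale_bits + bias_bits) / group_size


def quantized_size_gb(num_weights: int, bits: int, group_size: int) -> float:
    """Size in GB (10^9 bytes) of `num_weights` quantized weights."""
    total_bits = num_weights * effective_bits_per_weight(bits, group_size)
    return total_bits / 8 / 1e9


bpw = effective_bits_per_weight(bits=6, group_size=64)
print(f"effective bits per weight: {bpw}")

# Illustrative only: a hypothetical block of one billion quantized weights.
print(f"size of 1e9 weights: {quantized_size_gb(1_000_000_000, 6, 64):.4f} GB")
```

This does not attempt to reproduce the ~6.6 GB figure, since a checkpoint may also contain tensors that were left unquantized.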

By TyKaoz (privacy-first native macOS LLM chat client). License: Apache 2.0, inherited from the base model.
