Gemma 4 12B-it - JANG_4M

Apple Silicon MLX bundle for Osaurus and compatible vMLX runtimes.

Important update (2026-06-03 4:06 PM PDT): These weights were rebuilt with the verified Gemma 4 12B fix. If you downloaded this repository before 2026-06-03 4:06 PM PDT, delete the local copy and re-download.

Model Details

Property	Value
Base model	`google/gemma-4-12B-it`
Architecture	Gemma 4 unified dense 12B, text + image/audio/video-capable metadata
Format	MLX safetensors
Quantization	JANG mixed precision: attention 8-bit, MLP 4-bit, group size 32; tied embedding and multimodal embedders fp16 passthrough
Tied token embedding	fp16 passthrough (`embed_tokens.weight` is not quantized)
Multimodal embedders	fp16 passthrough
Package size	10.17 GB
Shards	10 safetensors shards
Chat template	Gemma 4 tool-aware template, no default no-thinking thought-channel tail

Runtime Notes

These rebuilt bundles preserve the tied token embedding in fp16 while keeping the main projection weights quantized. This fixes the bad prior artifact where embed_tokens.weight was packed and scaled like a normal linear weight.

The bundle includes generation_config.json, chat_template.jinja, tokenizer_config.json, and processor_config.json for Osaurus/vMLX loading.

Loading

Use Osaurus for local Apple Silicon chat and multimodal workflows, or load the bundle in a compatible MLX runtime:

from mlx_lm import load, generate

model, tokenizer = load("JANGQ-AI/gemma-4-12B-it-JANG_4M")
print(generate(model, tokenizer, "Hello", max_tokens=128))

Verification

Local release check for this rebuild:

Check	Status
`embed_tokens.weight` dtype	fp16
`embed_tokens.scales` / `embed_tokens.biases`	absent
Quantized attention projections	packed uint32
README front matter	valid Hugging Face YAML first
Re-download notice	present after YAML

Downloads last month: 497

Safetensors

Model size

3B params

Tensor type

F16

U32

MLX

Hardware compatibility

Quantized

Model tree for OsaurusAI/gemma-4-12B-it-JANG_4M

Base model

google/gemma-4-12B

Finetuned

google/gemma-4-12B-it

Quantized

(99)

this model