Osaurus AI

Gemma 4 12B-it - JANG_4M

Apple Silicon MLX bundle for Osaurus and compatible vMLX runtimes.

Website  OsaurusAI  JANG source  JANGQ-AI


Important update (2026-06-03 4:06 PM PDT): These weights were rebuilt with the verified Gemma 4 12B fix. If you downloaded this repository before 2026-06-03 4:06 PM PDT, delete the local copy and re-download.


Model Details

| Property | Value |
| --- | --- |
| Base model | google/gemma-4-12B-it |
| Architecture | Gemma 4 unified dense 12B, text + image/audio/video-capable metadata |
| Format | MLX safetensors |
| Quantization | JANG mixed precision: attention 8-bit, MLP 4-bit, group size 32; tied embedding and multimodal embedders fp16 passthrough |
| Tied token embedding | fp16 passthrough (embed_tokens.weight is not quantized) |
| Multimodal embedders | fp16 passthrough |
| Package size | 10.17 GB |
| Shards | 10 safetensors shards |
| Chat template | Gemma 4 tool-aware template, no default no-thinking thought-channel tail |
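
To see what the quantization row implies for storage, the arithmetic can be sketched as below. This assumes an MLX-style group-wise affine layout in which every group of weights carries one fp16 scale and one fp16 bias; whether JANG stores exactly this overhead is an assumption, and `effective_bits` is a hypothetical helper name, not part of any library.

```python
def effective_bits(bits: int, group_size: int, meta_bits: int = 16) -> float:
    """Average storage per weight, in bits, for group-wise affine quantization.

    Assumes one scale and one bias of `meta_bits` each per group. This is an
    assumption about the on-disk layout, not something the model card states.
    """
    return bits + 2 * meta_bits / group_size


# Values from the Model Details table: group size 32, attention 8-bit, MLP 4-bit.
print(effective_bits(8, 32))  # 9.0
print(effective_bits(4, 32))  # 5.0
```

Under that assumption, the small group size costs one extra bit per weight compared with the nominal 8-bit and 4-bit widths.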

Runtime Notes

These rebuilt bundles keep the tied token embedding in fp16 while the main projection weights stay quantized. This fixes the defect in the earlier artifact, where embed_tokens.weight was packed and scaled like an ordinary linear weight.
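
The passthrough rule described above can be sketched as a small lookup from tensor path to precision. This is an illustration only: the function name `precision_for` and the path fragments `self_attn`, `mlp`, and `embed` are assumptions based on common Gemma-style naming, not the actual rules used by the JANG tooling.

```python
from typing import Optional


def precision_for(path: str) -> Optional[int]:
    """Return the quantization bit width for a weight path, or None for fp16 passthrough."""
    if "embed" in path:       # tied token embedding and multimodal embedders (assumed naming)
        return None
    if "self_attn" in path:   # attention projections (assumed naming)
        return 8
    if "mlp" in path:         # feed-forward projections (assumed naming)
        return 4
    return None               # anything else is left unquantized in this sketch


for name in (
    "model.embed_tokens.weight",
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.mlp.down_proj.weight",
):
    print(name, "->", precision_for(name))
```

Checking the embedding rule first matters in a scheme like this: a tied embedding also serves as the output projection, so treating it as a regular linear weight is the kind of mistake the rebuild corrects.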

The bundle includes generation_config.json, chat_template.jinja, tokenizer_config.json, and processor_config.json for Osaurus/vMLX loading.

Loading

Use Osaurus for local Apple Silicon chat and multimodal workflows, or load the bundle in a compatible MLX runtime:

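from mlx_lm import load, generate

# Download the bundle (or reuse the cached copy) and load the model and tokenizer.
model, tokenizer = load("JANGQ-AI/gemma-4-12B-it-JANG_4M")
# Generate up to 128 new tokens from a plain-text prompt.
print(generate(model, tokenizer, "Hello", max_tokens=128))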

Verification

Local release check for this rebuild:

| Check | Status |
| --- | --- |
| embed_tokens.weight dtype | fp16 |
| embed_tokens.scales / embed_tokens.biases | absent |
| Quantized attention projections | packed uint32 |
| README front matter | valid Hugging Face YAML first |
| Re-download notice | present after YAML |
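
The first two checks can be reproduced on a local copy by reading only the safetensors headers. The sketch below relies on the documented safetensors layout (an 8-byte little-endian header length followed by a JSON header); the helper names `read_header` and `check_embedding` are hypothetical, and matching tensor names by suffix is an assumption about how the bundle names its tensors.

```python
import json
import struct
from pathlib import Path


def read_header(shard: Path) -> dict:
    """Read the JSON header of one safetensors file without loading tensor data."""
    with shard.open("rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # 8-byte little-endian length
        return json.loads(f.read(header_len))


def check_embedding(bundle_dir: str) -> dict:
    """Report the dtype of embed_tokens.weight and any embed_tokens quantization tensors."""
    result = {"weight_dtype": None, "quant_tensors": []}
    for shard in sorted(Path(bundle_dir).glob("*.safetensors")):
        for name, info in read_header(shard).items():
            if name == "__metadata__":  # optional free-form metadata entry, not a tensor
                continue
            if name.endswith("embed_tokens.weight"):
                result["weight_dtype"] = info["dtype"]
            elif name.endswith(("embed_tokens.scales", "embed_tokens.biases")):
                result["quant_tensors"].append(name)
    return result
```

Calling `check_embedding` on the directory that holds the downloaded shards should, for a correctly rebuilt bundle, report a `weight_dtype` of `"F16"` and an empty `quant_tensors` list. A copy downloaded before the rebuild would be expected to list scales and biases instead.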