status: staged experiment. built for distributed testing. token parity vs monolithic is not yet verified (open issue); not 1:1-verified.

rhm-gemma-4-e4b-it-staged-caix

Staged Core AI export for google/gemma-4-E4B-it, built for caix distributed inference testing on Apple silicon.

export: staged .aimodel bundle
source: google/gemma-4-E4B-it
compute: bfloat16
weights: 4-bit
context: 128 tokens
stages: embeddings, two transformer shards, head; each has main and decode assets
status: structural check passed; caix cluster plan accepts the manifest for a 64 GB Studio plus 32 GB MacBook setup; hardware runtime smoke is still pending

install

brew upgrade redhillsmediafl/caix/caix || brew install redhillsmediafl/caix/caix
caix catalog install redhillsmediafl/rhm-gemma-4-e4b-it-staged-caix

plan

caix cluster plan \
  --manifest ~/.caix/models/exports/gemma4-e4b-it-staged-4bit-ctx128-2x21/stage-manifest.json \
  --workers studio=64,macbook=32 \
  --kv-capacity 128

Run workers with caix cluster join and the coordinator with caix serve --cluster using the same manifest.

_{More open-source work: redhillsmediafl.com/open-source.}