status: staged experiment, built for distributed testing. token parity vs the monolithic model has not yet been verified (open issue).

rhm-gemma-4-e4b-it-staged-caix

Staged Core AI export for google/gemma-4-E4B-it, built for caix distributed inference testing on Apple silicon.

  • export: staged .aimodel bundle
  • source: google/gemma-4-E4B-it
  • compute: bfloat16
  • weights: 4-bit
  • context: 128 tokens
  • stages: embeddings, two transformer shards, head; each has main and decode assets
  • status: structural check passed; caix cluster plan accepts the manifest for a 64 GB Studio plus 32 GB MacBook setup; the hardware runtime smoke test is still pending

install

brew upgrade redhillsmediafl/caix/caix || brew install redhillsmediafl/caix/caix
caix catalog install redhillsmediafl/rhm-gemma-4-e4b-it-staged-caix

plan

caix cluster plan \
  --manifest ~/.caix/models/exports/gemma4-e4b-it-staged-4bit-ctx128-2x21/stage-manifest.json \
  --workers studio=64,macbook=32 \
  --kv-capacity 128
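
The plan command above fails in a less obvious way if the manifest path is wrong, so one option is to check the file first. A minimal sketch, assuming the install step placed the manifest at the path shown above; `check_manifest` and `MANIFEST` are hypothetical helper names for illustration, not part of caix, and the caix flags are copied unchanged from the command above:

```shell
#!/bin/sh
# Manifest path copied from the plan command above (assumed install location).
MANIFEST="$HOME/.caix/models/exports/gemma4-e4b-it-staged-4bit-ctx128-2x21/stage-manifest.json"

# Hypothetical helper: succeeds only if the file exists and is non-empty.
check_manifest() {
  [ -s "$1" ]
}

if check_manifest "$MANIFEST"; then
  # Same invocation as above, with the path taken from the variable.
  caix cluster plan \
    --manifest "$MANIFEST" \
    --workers studio=64,macbook=32 \
    --kv-capacity 128
else
  echo "manifest not found or empty: $MANIFEST" >&2
fi
```

Keeping the path in one variable also makes it easier to pass the same manifest to the workers and the coordinator.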

Run workers with caix cluster join and the coordinator with caix serve --cluster using the same manifest.

More open-source work: redhillsmediafl.com/open-source.
