How to use prism-ml/bonsai-image-ternary-4B-unpacked with Diffusers:
```shell
pip install -U diffusers transformers accelerate
```

```python
import torch
from diffusers import DiffusionPipeline

# switch to "mps" for Apple devices
pipe = DiffusionPipeline.from_pretrained("prism-ml/bonsai-image-ternary-4B-unpacked", dtype=torch.bfloat16, device_map="cuda")

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt).images[0]
```
Bonsai Image · Ternary 4B — Unpacked FP16 Safetensors
FP16 safetensors (HuggingFace diffusers format) of the ternary Bonsai Image 4B model. This repo exists for users who want to run Bonsai Image with stock diffusers or other frameworks that don't yet support our low-bit packs natively. The ternary kernels live in MLX (Apple Silicon, 2-bit out of the box) and the gemlite low-bit GEMM stack (CUDA).
We strongly recommend using the optimized low-bit packs instead. The ternary format is where the Bonsai Image gains come from — a 6.4× transformer footprint reduction, sub-iPhone deployment, and ~5× faster inference vs the stock FP16 pipeline on Apple Silicon. This unpacked FP16 version is full-size and provides none of those advantages.
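To put the 6.4× figure in rough numbers, the sketch below does the arithmetic for a hypothetical parameter count. It assumes all of the nominal 4B parameters sit in the transformer, which this card does not state, so treat the results as order-of-magnitude estimates rather than measured sizes. The function names are illustrative and not part of any Bonsai tooling.

```python
# Back-of-the-envelope footprint arithmetic (illustrative only).
# ASSUMPTION: the full nominal 4e9 parameters belong to the transformer;
# the model card does not give the exact split.
BYTES_PER_FP16_PARAM = 2
REDUCTION = 6.4  # transformer footprint reduction quoted in this card


def fp16_gib(num_params: float) -> float:
    """Size in GiB of num_params weights stored as FP16."""
    return num_params * BYTES_PER_FP16_PARAM / 2**30


def packed_gib(num_params: float, reduction: float = REDUCTION) -> float:
    """Size in GiB after applying the quoted footprint reduction."""
    return fp16_gib(num_params) / reduction


def effective_bits_per_weight(reduction: float = REDUCTION) -> float:
    """Average bits per weight implied by the reduction relative to FP16."""
    return 16 / reduction


params = 4e9  # hypothetical transformer parameter count
print(f"FP16 transformer:   {fp16_gib(params):.2f} GiB")
print(f"Packed transformer: {packed_gib(params):.2f} GiB")
print(f"Effective bits per weight: {effective_bits_per_weight():.2f}")
```

Under these assumptions the reduction works out to about 2.5 bits per weight on average, slightly above the nominal 2-bit packing. The card does not explain where the difference comes from.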
For the optimized ternary release packs (recommended):
- bonsai-image-ternary-4B-mlx-2bit — ternary 2-bit MLX for Apple Silicon (Mac, iPhone, iPad)
- bonsai-image-ternary-4B-gemlite-2bit — ternary 2-bit gemlite/HQQ for NVIDIA GPUs (Linux + Windows)
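The choice between these packs and the unpacked repo can be sketched as a small helper. This is a hypothetical illustration, not part of the Bonsai tooling: `suggest_repo` is an invented name, and the `prism-ml/` prefix on the pack names is an assumption carried over from this repo's own ID.

```python
import platform

# ASSUMPTION: the packs are published under the same "prism-ml" namespace
# as this unpacked repo; the card lists only the bare repo names.
ORG = "prism-ml"
UNPACKED = f"{ORG}/bonsai-image-ternary-4B-unpacked"


def suggest_repo(system: str, machine: str, has_nvidia_gpu: bool) -> str:
    """Return the repo ID that matches the platform notes in this card."""
    if system == "Darwin" and machine == "arm64":
        # Apple Silicon: ternary 2-bit MLX pack
        return f"{ORG}/bonsai-image-ternary-4B-mlx-2bit"
    if system in ("Linux", "Windows") and has_nvidia_gpu:
        # NVIDIA GPU on Linux or Windows: ternary 2-bit gemlite/HQQ pack
        return f"{ORG}/bonsai-image-ternary-4B-gemlite-2bit"
    # Anything else: fall back to the full-size FP16 safetensors
    return UNPACKED


# GPU detection is left to the caller; False here is just a placeholder.
print(suggest_repo(platform.system(), platform.machine(), has_nvidia_gpu=False))
```

Passing the platform values in as arguments keeps the helper easy to check without depending on the machine it runs on.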
For the smaller-footprint variant:
- bonsai-image-binary-4B-unpacked — Binary FP16 safetensors (and the matching MLX 1-bit + gemlite INT1 packs)
See the Bonsai Image Demo repo for one-command setup of either variant on Mac, Linux, or Windows.