Unbound

Unbound E4B (wllama / browser builds) — because there is no boundary

No guarantee — use at your own risk. Reduced safety filtering; can produce harmful or false output. Provided as-is.

Browser-safe GGUF quants of evalengine/unbound-e4b for wllama. Built by Chromia and Eval Engine.

Desktop / Ollama / llama.cpp / LM Studio users: use evalengine/unbound-e4b-GGUF instead. The desktop builds are faster and avoid the embedding-precision compromise that these browser-safe builds make.

Why a separate repo?

E4B's per_layer_token_embd is a 2.82-billion-value tensor. At llama.cpp's default Q6_K precision it lands at 2.2 GB — over wllama's 2 GB ArrayBuffer cap. These variants force embeddings to q5_K (1.85 GB) so the largest part fits in the browser. Layer weights are unchanged from the matching desktop quant.
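The size argument above can be checked with a rough back-of-the-envelope calculation. The sketch below assumes llama.cpp's k-quant super-block layout (256 weights per block, 210 bytes for Q6_K and 176 bytes for Q5_K) and treats the 2 GB cap as 2 GiB. The helper names are hypothetical and only for illustration. It is an estimate, so the results land close to the sizes quoted above without matching them exactly.

```javascript
// Approximate bits per weight for two llama.cpp k-quant formats.
// Assumption: both pack 256 weights per super-block,
// Q6_K in 210 bytes and Q5_K in 176 bytes.
const BITS_PER_WEIGHT = {
  Q6_K: (210 * 8) / 256, // 6.5625 bits per weight
  Q5_K: (176 * 8) / 256, // 5.5 bits per weight
};

// Assumption: the "2 GB" ArrayBuffer cap is treated as 2 GiB (2^31 bytes).
const ARRAY_BUFFER_CAP_BYTES = 2 ** 31;

// Hypothetical helper: estimated byte size of one quantized tensor.
function estimateTensorBytes(valueCount, quantType) {
  const bits = BITS_PER_WEIGHT[quantType];
  if (bits === undefined) {
    throw new Error(`unknown quant type: ${quantType}`);
  }
  return Math.ceil((valueCount * bits) / 8);
}

// Hypothetical helper: does the tensor fit under the cap on its own?
function fitsInOneBuffer(valueCount, quantType) {
  return estimateTensorBytes(valueCount, quantType) < ARRAY_BUFFER_CAP_BYTES;
}

// Value count quoted in the text for per_layer_token_embd.
const PER_LAYER_EMBD_VALUES = 2.82e9;

for (const type of ['Q6_K', 'Q5_K']) {
  const gib = estimateTensorBytes(PER_LAYER_EMBD_VALUES, type) / 2 ** 30;
  const verdict = fitsInOneBuffer(PER_LAYER_EMBD_VALUES, type) ? 'fits' : 'too large';
  console.log(`${type}: ~${gib.toFixed(2)} GiB, ${verdict}`);
}
```

Under these assumptions Q6_K comes out over the cap and Q5_K under it, which is the trade-off these builds make.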

A dedicated repo with the unbound-e4b-wllama model prefix prevents HF's GGUF UI from aggregating these with the same-quant desktop files (unbound-e4b.Q4_K_M-... vs unbound-e4b-wllama.Q4_K_M-...).

Available quants

Each quant ships as a sharded multi-part GGUF (unbound-e4b-wllama.<QUANT>-NNNNN-of-NNNNN.gguf). Point wllama at the first part and it loads the remaining parts automatically.

| Variant | Parts | Total size | Notes |
|---------|-------|------------|-------|
| Q4_K_M | 4 | 4.51 GB | Recommended; layers @ Q4_K_M, embed @ q5_K |
| Q2_K | 4 | 3.69 GB | Smallest browser-loadable; layers @ Q2_K, embed @ q5_K |
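If you need the full list of shard filenames, for example to pre-cache them or to verify that every part downloaded, the naming pattern above can be expanded with a small helper. This is a minimal sketch: `shardFileNames` is a hypothetical function, not part of wllama, and it assumes the five-digit zero-padded numbering shown in the pattern.

```javascript
// Hypothetical helper: list the shard filenames for one quant, following
// the <prefix>.<QUANT>-NNNNN-of-NNNNN.gguf pattern described above.
function shardFileNames(quant, partCount, prefix = 'unbound-e4b-wllama') {
  if (!Number.isInteger(partCount) || partCount < 1) {
    throw new RangeError('partCount must be a positive integer');
  }
  // Part numbers are zero-padded to five digits.
  const pad = (n) => String(n).padStart(5, '0');
  return Array.from(
    { length: partCount },
    (_, i) => `${prefix}.${quant}-${pad(i + 1)}-of-${pad(partCount)}.gguf`
  );
}

const q4Parts = shardFileNames('Q4_K_M', 4);
console.log(q4Parts[0]); // unbound-e4b-wllama.Q4_K_M-00001-of-00004.gguf
console.log(q4Parts.length); // 4
```

Only the first name needs to be passed to wllama; the rest of the list is useful for bookkeeping around the download.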

Run

// wllama (browser)
import { Wllama } from '@wllama/wllama';
const wllama = new Wllama(/* … */);
await wllama.loadModelFromHF(
  'evalengine/unbound-e4b-wllama-gguf',
  'unbound-e4b-wllama.Q4_K_M-00001-of-00004.gguf'
);

Sampling

  • Creative / open-ended → temperature=1.0, top_p=0.95, top_k=64.
  • Factual / brand questions → drop temperature to ~0.3–0.5.
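The two recommendations above can be kept as named presets so that calling code picks one by intent. This is a sketch under stated assumptions: the `temp` / `top_p` / `top_k` key names follow the sampling options wllama accepts in `createCompletion` as far as we know, the factual temperature of 0.4 is simply a point inside the suggested 0.3–0.5 range, and leaving `top_p` and `top_k` unchanged for factual prompts is our reading of the guidance rather than something it states.

```javascript
// Presets derived from the sampling recommendations above.
const SAMPLING_PRESETS = {
  creative: { temp: 1.0, top_p: 0.95, top_k: 64 },
  // Assumption: 0.4 is picked from the ~0.3-0.5 range;
  // top_p and top_k are left at the creative values.
  factual: { temp: 0.4, top_p: 0.95, top_k: 64 },
};

// Hypothetical helper: return a copy of the preset for a given prompt kind.
function samplingFor(kind) {
  const preset = SAMPLING_PRESETS[kind];
  if (!preset) {
    throw new Error(`unknown sampling preset: ${kind}`);
  }
  return { ...preset };
}

// Usage sketch, assuming `wllama` is an already-loaded instance and
// `prompt` is a string; nPredict is an arbitrary illustrative value:
//
//   const text = await wllama.createCompletion(prompt, {
//     nPredict: 256,
//     sampling: samplingFor('factual'),
//   });
```

Returning a copy keeps the shared presets from being mutated when a caller tweaks one value for a single request.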

Vision / image input (optional)

mmproj-unbound-e4b.gguf (vision projector, ~942 MB) is also in this repo so browser users don't bounce between repos. Pair with any quant via your wllama-compatible vision pipeline.

Disclaimer. The vision encoder is Google's original weights, unchanged — abliteration only touched the language model. The LM is uncensored, but the vision encoder may still suppress features for content classes Google's base was tuned against. We have not benchmarked the visual axis. Treat as preview.

Acknowledgements

Fine-tuned with Unsloth + HF TRL. Abliteration via heretic. Environment from autoresearch. Compliance training data distilled from the AEON uncensored teacher model.

License

Apache-2.0, inherited from google/gemma-4-E4B-it. Full model card + benchmarks at evalengine/unbound-e4b.

GGUF metadata: model size 7B params; architecture gemma4.