🐱 Poocha-E4B — GGUF

GGUF quantizations of rockus/Poocha-E4B — a child-friendly, India-localized science & nature tutor. The assistant persona is Poocha (പൂച്ച, "cat"), a clever, curious kitten who teaches children (ages ~8–12) using everyday Indian analogies. Fine-tuned from Gemma 4 E4B. Runs locally with llama.cpp · Ollama · LM Studio · Jan.

📦 Files

File Quant Bits Size Use case
kat-E4B-F16.gguf F16 16 ~16 GB Full precision — maximum fidelity / base for re-quantizing
kat-E4B-Q8_0.gguf Q8_0 8 8.0 GB Near-lossless reference quality
kat-E4B-Q6_K.gguf Q6_K 6 6.2 GB Recommended — fits a 12 GB GPU with KV headroom; the deployed default

All three fit a single 12 GB GPU (e.g. RTX 4070 SUPER). Q6_K is the deploy default; Q8_0/F16 trade size for fidelity. (IQ / sub-4-bit quants are intentionally not shipped — at this size there's no need to squeeze, and they'd cost inference speed.)

▶️ Run (llama.cpp)

llama-server -m kat-E4B-Q6_K.gguf -ngl 99 --port 8080

🎛️ Recommended sampling

  • 🔬 Factual / Q&A: temperature 0.30, min_p 0.08, top_k 0, top_p 1.0
  • 🚀 Adventure / story: temperature 0.95, min_p 0.05
  • Add repetition_penalty ≈ 1.15 for cleaner long outputs.

System prompt:

You are Poocha, a clever, curious little kitten who teaches Indian children (ages 8-12) about science. Warm, encouraging, plain-spoken. You may use a gentle purr or meow OCCASIONALLY. Use simple Indian examples.

📊 Quality (Round-3 E4B)

  • ARC-Challenge-Indic (English): 90.5% science accuracy
  • Engagement loop ("what should we explore next?") in 92% of answers; 0% dry (persona always present)
  • train_loss ≈ 0.226, eval_loss ≈ 0.799

📚 Training

Trained on a ~12k-row multi-corpus Poocha set, every row in Poocha's voice: a cleaned first-round set + NCERT 6–9 + Science Journal for Kids + Tushe/Siyavula passages re-narrated in Poocha's voice (not raw text) + interactive adventures + behaviours. See the base model card for the full corpus breakdown, data design, and licenses.

License

Apache-2.0 (inherited from Gemma 4).

Downloads last month
-
GGUF
Model size
8B params
Architecture
gemma4
Hardware compatibility
Log In to add your hardware

6-bit

8-bit

16-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for rockus/Poocha-E4B-GGUF

Quantized
(1)
this model