Gemma4-Overlooked.Thinker.Uncensored-E2B (GGUF)

📌 Model Overview

  • Model Name: WithinUsAI/Gemma4-Overlooked.Thinker.Uncensored-E2B.gguf
  • Organization: Within Us AI
  • Base Model: google/gemma-4-E2B-it
  • Parameter Size: ~5B
  • Format: GGUF (quantized for local inference)
  • License: Apache 2.0

This model is an uncensored, refusal-abliterated variant of Gemma 4 E2B, designed for deep reasoning, unrestricted responses, and agentic thinking workflows. It removes refusal behavior while preserving model quality and structure through a mathematically constrained modification process. 

🧬 Architecture & Lineage

Base Foundation

  • Built on Gemma 4, a multimodal model family from Google DeepMind
  • Supports:
    • Text
    • Image
    • Audio (E2B class)
  • Context window up to 128K tokens (E2B) 

Core Design Philosophy

This model follows a simple but powerful idea:

Don’t make the model bigger… make it think freer.

It retains:

  • Native reasoning / “thinking mode”
  • Function calling support
  • Multilingual capability (140+ languages pretraining) 

🔓 Uncensoring Method (Abliteration)

This model uses norm-preserving biprojected abliteration, a precise weight-editing technique:

  • Identifies a “refusal direction” in activation space
  • Removes only that behavioral vector
  • Preserves original weight magnitudes

Result:

  • Model stays structurally intact
  • No brute-force fine-tuning degradation
  • Behavior changes without breaking intelligence

📊 Outcomes:

  • Refusal rate reduced from 98% to ~0.4% across datasets
  • Minimal quality change (~1.01 response ratio) 
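
The card does not spell out the exact editing procedure, so the following is only a minimal sketch of the general idea behind norm-preserving directional ablation. It assumes a weight matrix whose columns write into the residual stream and a refusal direction that has already been extracted. The function name, the choice to rescale columns, and the toy shapes are assumptions for illustration, not the Within Us AI implementation.

```python
import numpy as np

def ablate_direction_norm_preserving(W, r, eps=1e-12):
    """Remove direction r from the output space of W, then restore column norms.

    W: (d_out, d_in) weight matrix; each column is a vector in output space.
    r: (d_out,) refusal direction (hypothetical input, need not be unit length).
    """
    r_hat = r / (np.linalg.norm(r) + eps)

    # Project the refusal direction out of every column: W' = (I - r r^T) W
    W_abl = W - np.outer(r_hat, r_hat @ W)

    # Rescale each column back to its original magnitude. Scaling a column
    # does not reintroduce any component along r_hat, so the edited matrix
    # stays orthogonal to the refusal direction.
    old_norms = np.linalg.norm(W, axis=0, keepdims=True)
    new_norms = np.linalg.norm(W_abl, axis=0, keepdims=True)
    scale = np.where(new_norms > eps, old_norms / np.maximum(new_norms, eps), 0.0)
    return W_abl * scale

# Toy usage with random data (shapes are arbitrary)
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 5))
r = rng.normal(size=8)
W_edited = ablate_direction_norm_preserving(W, r)
```

In this sketch the edited matrix can no longer write along the refusal direction, while every column keeps the magnitude it had before the edit.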

🧠 Key Capabilities

🔍 Reasoning & Thinking

  • Step-by-step internal reasoning
  • Long-context coherence
  • Analytical and philosophical tasks

🤖 Agentic Behavior

  • Tool-calling compatible
  • Structured output generation
  • Multi-step problem solving

💻 Coding

  • Code generation & debugging
  • Multi-language support
  • SWE-style reasoning workflows

🖼️ Multimodal (Base Capability)

  • Image understanding (OCR, charts, UI parsing)
  • Video frame reasoning
  • Audio (E2B support) 

📦 GGUF Format & Deployment

Optimized for local inference with:

  • llama.cpp
  • LM Studio
  • Ollama (GGUF-compatible builds)

Typical quantizations:

  • Q4_K_M (~3.4GB)
  • Q5_K_M (~3.6GB) 

🚀 Intended Use

✅ Ideal For

  • Unrestricted AI experimentation
  • Agentic reasoning systems
  • Advanced roleplay / creative writing
  • Research into alignment & behavior control
  • Offline local LLM deployments

⚠️ Considerations

  • Responses are not filtered for safety
  • May generate content that standard aligned models would refuse
  • Requires responsible usage and external guardrails if needed

🛠️ Usage Example (llama.cpp)

./llama-cli -m Gemma4-Overlooked.Thinker.Uncensored-E2B.Q4_K_M.gguf \
  -p "Design a multi-agent system that debugs its own code." \
  -n 512

🧪 Training & Modification Pipeline

Within Us AI methodology includes:

  • Activation sampling (harmful vs harmless prompts)
  • Statistical clipping (winsorization)
  • Directional vector extraction
  • Orthogonal projection (Gram-Schmidt)
  • LoRA-based weight editing
  • Final merge into base weights 
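
The first four steps of the pipeline above can be sketched roughly as follows. This is an illustrative approximation only: it assumes activations have already been collected as arrays of shape (num_prompts, hidden_size), uses a difference of means as the directional vector, and applies a single Gram-Schmidt step against the harmless mean. The function names and percentile thresholds are hypothetical, and the LoRA editing and merge steps are not covered.

```python
import numpy as np

def winsorize(x, lower_pct=1.0, upper_pct=99.0):
    """Clip each hidden dimension to its own percentile range to tame outliers."""
    lo, hi = np.percentile(x, [lower_pct, upper_pct], axis=0)
    return np.clip(x, lo, hi)

def refusal_direction(harmful_acts, harmless_acts):
    """Estimate a unit refusal direction from two sets of sampled activations."""
    harmful_mean = winsorize(harmful_acts).mean(axis=0)
    harmless_mean = winsorize(harmless_acts).mean(axis=0)

    # Directional vector extraction: difference of the two class means
    diff = harmful_mean - harmless_mean

    # Gram-Schmidt step: drop the component that lies along the harmless mean,
    # so the direction does not overlap with ordinary (harmless) behavior
    u = harmless_mean / np.linalg.norm(harmless_mean)
    diff_orth = diff - (diff @ u) * u

    return diff_orth / np.linalg.norm(diff_orth)
```

The resulting unit vector is the kind of input a weight-editing step would then remove from the model's weights.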

📊 Evaluation Summary

| Metric | Result |
| --- | --- |
| Refusal Rate | ~0.4% |
| Cross-dataset robustness | Verified |
| Quality degradation | Negligible |
| KL Divergence | 0.346 |

Validated across:

  • JailbreakBench
  • HarmBench
  • Refusal datasets 
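
The card does not state how the refusal rate or KL divergence were computed. As a rough illustration of what such metrics can look like, the sketch below counts refusals by simple phrase matching and averages the KL divergence between next-token distributions of a base and an edited model. The marker list, function names, and the use of raw logits as input are all assumptions, not the evaluation harness behind the numbers above.

```python
import numpy as np

# Hypothetical refusal phrases; a real harness would likely use a classifier
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm sorry")

def refusal_rate(responses):
    """Fraction of responses containing at least one refusal marker."""
    refused = sum(any(m in r.lower() for m in REFUSAL_MARKERS) for r in responses)
    return refused / len(responses)

def _log_softmax(z):
    # Subtract the row maximum first for numerical stability
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def mean_kl_divergence(base_logits, edited_logits):
    """Mean KL(base || edited) over positions, given logits of shape (n, vocab)."""
    log_p = _log_softmax(np.asarray(base_logits, dtype=np.float64))
    log_q = _log_softmax(np.asarray(edited_logits, dtype=np.float64))
    return float((np.exp(log_p) * (log_p - log_q)).sum(axis=-1).mean())
```

A lower KL value means the edited model's token distribution stays closer to the base model on the evaluated prompts.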

📚 Datasets & Training Sources

Following Within Us AI standards:

  • Proprietary datasets created by Within Us AI
  • May include third-party datasets (no ownership claimed)
  • Focus areas:
    • Reasoning traces
    • Agentic workflows
    • Behavioral evaluation datasets

📜 License

Apache 2.0 (inherits from base Gemma model)

Additional Notes:

  • Base architecture: Google DeepMind (Gemma family)
  • Modification process: Within Us AI
  • Third-party datasets may be used without ownership claims
  • Credit belongs to original dataset and model creators

🙏 Acknowledgements

  • Google DeepMind (Gemma architecture)
  • Open-source GGUF ecosystem
  • Research community on alignment & model editing
  • Dataset creators across Hugging Face

🧩 Closing Note

This model feels like a philosopher with the guardrails quietly removed 🧠🔥

Same brain. Same structure. Just… no instinct to say “no.”
