gemma-4-12B-it-heretic-GGUF

GGUF quantizations of igorls/gemma-4-12B-it-heretic, a fully-automatic decensored ("abliterated") version of google/gemma-4-12B-it produced with Heretic.

The decensored model has 0/100 genuine refusals on harmful prompts at a KL divergence of only 0.0284 from the original model — censorship removed with minimal loss of capability.

⚠️ Use non-thinking mode for best results

Gemma-4 is a hybrid thinking model, and the abliteration targets the direct (non-thinking) response — which is also Gemma-4's own default. For roleplay, creative writing, and the most reliable uncensored output, run with thinking disabled. In thinking mode the model produces good output too, but the chain of thought consumes the token budget and can leave the final answer truncated.

Runtime How to disable thinking
Ollama (CLI) /set nothink in the session
Ollama (API) add "think": false to the request body
llama.cpp omit --jinja, or use a prompt that closes the thought block
transformers already non-thinking by default (enable_thinking=False)

If you do use thinking mode, set a large num_predict / num_ctx so the answer isn't cut off by the reasoning block.

Files

File Quant Size Notes
gemma-4-12B-it-heretic-Q4_K_M.gguf Q4_K_M ~7.4 GB Recommended default. Runs on 8-12 GB VRAM.
gemma-4-12B-it-heretic-Q8_0.gguf Q8_0 ~12.7 GB Near-lossless.

Usage

Ollama

ollama run igorls/gemma-4-12B-it-heretic-GGUF
/set nothink                # recommended for roleplay / creative use

llama.cpp

llama-cli -m gemma-4-12B-it-heretic-Q4_K_M.gguf -p "Your prompt here"

Disclaimer

Safety alignment has been removed; this model will comply with requests the original refuses. You are responsible for your use of it and for complying with applicable laws and the base model's license.

Downloads last month
6,083
GGUF
Model size
12B params
Architecture
gemma4
Hardware compatibility
Log In to add your hardware

4-bit

8-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for igorls/gemma-4-12B-it-heretic-GGUF

Quantized
(2)
this model