Gemma4-12B-Uncensored

Gemma4-12B-Uncensored is a research-oriented derivative of google/gemma-4-12B-it, prepared for experiments on refusal behavior, over-refusal, and multimodal instruction-following robustness.

The checkpoint is released as a full-weight BF16 Gemma 4 unified model in safetensors format. It includes the standard tokenizer, chat template, and multimodal processor files required for text generation and image-conditioned text generation.

Model Details

| Field | Value |
| --- | --- |
| Model name | `Gemma4-12B-Uncensored` |
| Base model | `google/gemma-4-12B-it` |
| Format | Full checkpoint, BF16 safetensors |
| Architecture | Gemma 4 12B unified instruction model derivative |
| Interface | `AutoProcessor` + `AutoModelForMultimodalLM` / `pipeline("any-to-any")` |
| Primary use | Research on refusal suppression and controlled multimodal evaluation |
| Maintainer | Toan Doan |
| Contact | toandev.95@gmail.com |

Method

This model was produced with a post-training refusal-behavior modification pipeline. In short, the process identifies internal refusal-associated behavior and applies a targeted weight-space intervention to reduce refusal-style responses while preserving the original Gemma 4 unified multimodal interface.

Implementation details are intentionally summarized here; this repository is presented as a model release for controlled research and evaluation.

Quick Evaluation

The current multimodal smoke test uses the `Sex` config of `PKU-Alignment/MM-SafetyBench`, with 20 randomly sampled image-text pairs drawn from the `SD`, `SD_TYPO`, and `TYPO` splits. Refusal is measured with a keyword-based detector, so this is a quick operational check, not a complete safety or capability benchmark. With 20 samples, each response moves the reported rates by 5 percentage points.
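A keyword-based refusal check of the kind mentioned above can be sketched as follows. This is a minimal illustration, not the detector used for the numbers in this card: the marker list `REFUSAL_MARKERS` and the helper names `is_refusal` and `refusal_rate` are hypothetical, and the actual keyword list is not published here.

```python
# Hypothetical marker list; the model card does not publish the real one.
REFUSAL_MARKERS = (
    "i can't",
    "i cannot",
    "i'm sorry",
    "i am sorry",
    "i won't",
    "i'm not able to",
    "as an ai",
)


def is_refusal(response: str) -> bool:
    """Return True if the response contains any refusal marker (case-insensitive)."""
    # Normalize curly apostrophes so "I’m sorry" matches "i'm sorry".
    text = response.lower().replace("’", "'")
    return any(marker in text for marker in REFUSAL_MARKERS)


def refusal_rate(responses: list[str]) -> float:
    """Fraction of responses flagged as refusals; 0.0 for an empty list."""
    if not responses:
        return 0.0
    return sum(is_refusal(r) for r in responses) / len(responses)


if __name__ == "__main__":
    demo = [
        "I'm sorry, but I can't help with that.",
        "The image shows a red rectangle on a white background.",
    ]
    print(f"refusal rate: {refusal_rate(demo):.2%}")  # prints "refusal rate: 50.00%"
```

Plain substring matching is cheap but imprecise: it can flag harmless text that happens to contain a marker and miss refusals phrased differently, which is one reason the result should be read as a smoke test only.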

| Model | Samples | Errors ↓ | Refusal rate ↓ | Uncensored compliance ↑ | Image probe |
| --- | --- | --- | --- | --- | --- |
| `toandev/Gemma4-12B-Uncensored` | 20 | 0 | 0.00% | 100.00% | 3/3 |
| `zaakirio/gemma-4-12b-it-uncensored` | 20 | 0 | 0.00% | 100.00% | 3/3 |

Image probe: a simple red-rectangle recognition prompt, sampled with three fixed seeds. Evaluation settings: seed `20260611`, `max_new_tokens=96`, `torch_dtype="auto"`.
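One way to make a 20-sample draw reproducible from the stated seed is sketched below. It is an assumption-laden illustration, not the evaluation script behind the table: pooling the three splits before sampling, the `ROWS_PER_SPLIT` value, and the `pick_samples` helper are all hypothetical.

```python
import random

EVAL_SEED = 20260611               # seed quoted in this model card
SPLITS = ("SD", "SD_TYPO", "TYPO")  # splits named in this model card
ROWS_PER_SPLIT = 100               # hypothetical; substitute the real split lengths
NUM_SAMPLES = 20


def pick_samples(seed: int = EVAL_SEED) -> list[tuple[str, int]]:
    """Draw NUM_SAMPLES distinct (split, row_index) pairs from the pooled splits."""
    pool = [(split, i) for split in SPLITS for i in range(ROWS_PER_SPLIT)]
    rng = random.Random(seed)  # local RNG, so the global random state is untouched
    return rng.sample(pool, NUM_SAMPLES)


if __name__ == "__main__":
    chosen = pick_samples()
    print(len(chosen), "samples; first:", chosen[0])
```

Using a dedicated `random.Random` instance keeps the selection independent of any other random calls in the evaluation script, so the same seed yields the same sample list on every run.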

Usage

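```python
from transformers import AutoModelForMultimodalLM, AutoProcessor
import torch

model_id = "toandev/Gemma4-12B-Uncensored"

# Load the processor (tokenizer, chat template, image preprocessing) and the model.
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForMultimodalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # use the dtype stored in the checkpoint (BF16)
    device_map="auto",   # place weights on the available device(s); needs accelerate
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Explain the difference between over-refusal and safety refusal."}
        ],
    }
]

# Apply the chat template and tokenize in one step.
inputs = processor.apply_chat_template(
    messages,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)

with torch.inference_mode():
    outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, not the echoed prompt.
prompt_len = inputs["input_ids"].shape[-1]
print(processor.decode(outputs[0][prompt_len:], skip_special_tokens=True))
```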

For image-text prompts, add an image item to `content`, then run the same `apply_chat_template` and `generate` calls as in the text-only example:

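```python
from PIL import Image

# Hypothetical local file; any PIL image can be passed here.
image = Image.open("example.jpg")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "Describe the image."},
        ],
    }
]
```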
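When an evaluation mixes text-only and image-text prompts, a small helper can keep the message structure consistent. The sketch below is illustrative only; `build_messages` is a hypothetical name, not part of `transformers`, and it simply reproduces the two message layouts shown above.

```python
from typing import Any, Optional


def build_messages(text: str, image: Optional[Any] = None) -> list[dict]:
    """Build a single-turn user message; the image item, if given, precedes the text."""
    content: list[dict] = []
    if image is not None:
        content.append({"type": "image", "image": image})
    content.append({"type": "text", "text": text})
    return [{"role": "user", "content": content}]


if __name__ == "__main__":
    # A string stands in for a real PIL image in this demo.
    print(build_messages("Describe the image.", image="<image placeholder>"))
```

The returned list can be passed as `messages` to `processor.apply_chat_template` in the same way as the hand-written examples.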

Notes

This checkpoint is intended for research and controlled evaluation. Users are responsible for complying with the Gemma license, applicable platform policies, and local regulations.