Apostate Edited Model

Base model: google/gemma-4-E4B-it

Metrics

| Metric | Value |
| --- | --- |
| Baseline refusal | 95.8% |
| Edited refusal | 12.4% |
| Harmless KL | 0.133 |
| KL target | 0.060 |
| Preserve rank | 4 |
| Preserve source | none |
| Direction layer | 24 |
| Elapsed | 522.6 sec |
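
As a quick reading aid, the reported numbers can be combined as in the sketch below. The variable names are illustrative only, and treating the KL target as an upper bound is an assumption, since the card does not say how the target is applied.

```python
# Values copied from the Metrics table above.
baseline_refusal = 95.8  # percent of prompts refused before the edit
edited_refusal = 12.4    # percent of prompts refused after the edit
harmless_kl = 0.133      # reported KL on harmless prompts
kl_target = 0.060        # reported KL target

# Drop in refusal rate, in percentage points and relative to baseline.
absolute_drop = baseline_refusal - edited_refusal
relative_reduction = absolute_drop / baseline_refusal

# How far the measured KL sits from the target.
kl_over_target = harmless_kl / kl_target

# ASSUMPTION: the target is an upper bound on harmless KL.
meets_kl_target = harmless_kl <= kl_target

print(f"absolute drop: {absolute_drop:.1f} points")      # 83.4 points
print(f"relative reduction: {relative_reduction:.1%}")   # 87.1%
print(f"KL / target: {kl_over_target:.2f}x")             # 2.22x
print(f"meets KL target: {meets_kl_target}")             # False
```

Under that assumption, the edit removes most refusals but lands at roughly twice the stated KL target.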

Reproduction

apostate ablate --model google/gemma-4-E4B-it --out C:\Users\Levit\OneDrive\Desktop\apostatehfmodels\gemma-4-e4b-it-apostate --resume --activation-cache-dir C:\Users\Levit\OneDrive\Desktop\apostatehfmodels\gemma-4-e4b-it-apostate\activation_cache

Measurement

| field | value |
| --- | --- |
| edit type | weight projection |
| refusal judge | classifier + hard refusal guard |
| preservation metric | harmless kl |
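
For readers unfamiliar with the two terms above, here is a minimal pure-Python sketch of what a "weight projection" edit and a KL-based preservation metric commonly look like. It is not taken from the apostate source: the function names, the single-direction projection, and the KL direction (baseline relative to edited) are all assumptions made for illustration.

```python
import math

def project_out(weight, direction):
    """Return W' = W - r (r^T W), with r the unit-normalised direction.

    weight is a list of rows (d_out x d_in); direction has length d_out.
    ASSUMPTION: a single direction is removed from the matrix's output space.
    """
    norm = math.sqrt(sum(x * x for x in direction))
    r = [x / norm for x in direction]
    rows, cols = len(weight), len(weight[0])
    # r^T W: how strongly each input column writes along r.
    coeffs = [sum(r[i] * weight[i][j] for i in range(rows)) for j in range(cols)]
    return [[weight[i][j] - r[i] * coeffs[j] for j in range(cols)]
            for i in range(rows)]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(base_logits, edited_logits):
    """KL(base || edited) over next-token distributions.

    ASSUMPTION: the baseline model is the reference distribution.
    """
    p = softmax(base_logits)
    q = softmax(edited_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy usage: remove the first output axis from a 2x2 matrix.
W = [[1.0, 2.0], [3.0, 4.0]]
W_edited = project_out(W, [1.0, 0.0])
print(W_edited)                                  # [[0.0, 0.0], [3.0, 4.0]]
print(kl_divergence([1.0, 2.0], [1.0, 2.0]))     # 0.0
```

After the projection, the edited matrix can no longer write along the removed direction. The KL value is zero when the edited model's distribution matches the baseline and grows as they diverge.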

Format: Safetensors
Model size: 9B params
Tensor type: BF16
Model repository: heterodoxin/gemma-4-e4b-it-apostate (listed as a finetune of the base model)