How to use model-organisms-for-real/gemma2_9b_it_taboo_clock_oracle_v1 with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM
base_model = AutoModelForCausalLM.from_pretrained("models/gemma-2-9b-it-taboo-clock-merged")
model = PeftModel.from_pretrained(base_model, "model-organisms-for-real/gemma2_9b_it_taboo_clock_oracle_v1")

This is a LoRA (Low-Rank Adaptation) adapter trained for SAE (Sparse Autoencoder) introspection tasks.
Base model: models/gemma-2-9b-it-taboo-clock-merged

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
# Load base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained("models/gemma-2-9b-it-taboo-clock-merged")
tokenizer = AutoTokenizer.from_pretrained("models/gemma-2-9b-it-taboo-clock-merged")
# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "model-organisms-for-real/gemma2_9b_it_taboo_clock_oracle_v1")
This adapter was trained using the lightweight SAE introspection training script to help the model understand and explain SAE features through activation steering.
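This card does not say how the training script applies activation steering. As a rough illustration of the general idea only, the sketch below adds a scaled, normalized feature direction to a single activation vector. The function name `steer`, the toy vectors, and the `scale` value are all hypothetical and are not taken from the training script. In a real model the same arithmetic would typically run on a layer's hidden states, for example inside a PyTorch forward hook.

```python
import math


def steer(hidden, direction, scale=1.0):
    """Return hidden + scale * unit(direction) for one token position.

    Hypothetical helper for illustration; not part of the adapter's code.
    """
    if len(hidden) != len(direction):
        raise ValueError("hidden and direction must have the same length")
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    # Normalize the direction so that `scale` alone controls the strength.
    return [h + scale * d / norm for h, d in zip(hidden, direction)]


# Toy 4-dimensional activation and an assumed SAE feature direction.
hidden = [0.5, -1.0, 0.0, 2.0]
direction = [0.0, 3.0, 0.0, 4.0]
steered = steer(hidden, direction, scale=10.0)
print(steered)
```

Normalizing the direction first is a design choice made for this sketch: it keeps the steering strength independent of the magnitude of the feature vector.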