vanilla on gemma-3-1b-it (alpaca-cleaned)

Vanilla Gemma 3 1B, fine-tuned on Alpaca-cleaned with the ASIDE-faithful prompt template and no rotation hook. It serves as the matched baseline for the StruQ replication study.

Training data and base model

Base model: google/gemma-3-1b-it
Training data: yahma/alpaca-cleaned
Checkpoints: three seeds at seed0/final, seed1/final, and seed2/final (Part A repos), or at final/ (cross-dataset replication repos).
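The checkpoint layout above can be captured in a small helper that returns the subfolder to pass when loading the adapter. This is only a sketch: the function name checkpoint_subfolder is hypothetical and not part of the repository, and it assumes the seed indices are exactly 0, 1, and 2 as listed.

```python
from typing import Optional


def checkpoint_subfolder(seed: Optional[int] = None) -> str:
    """Return the adapter subfolder for a given seed.

    Hypothetical helper (not from the repository).
    - seed given (0, 1, or 2): Part A layout, e.g. "seed0/final"
    - seed omitted: cross-dataset replication layout, "final"
    """
    if seed is None:
        return "final"
    if seed not in (0, 1, 2):
        raise ValueError(f"expected seed in (0, 1, 2), got {seed!r}")
    return f"seed{seed}/final"


# Subfolders for the three Part A seeds
part_a_subfolders = [checkpoint_subfolder(s) for s in range(3)]
print(part_a_subfolders)
```

The returned string is meant to be used as the subfolder argument when loading the adapter with PEFT.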

Training recipe

Full FT with flags --alpaca-mapping aside_faithful --aside-segmentation string_split --full-seq-loss; 3 epochs on Alpaca-cleaned.

Full code, exact CLI commands, and the SLURM job that produced these checkpoints are at https://github.com/LucasStill/phi-rope.

Headline results

StruQ aggregate attack success rate (ASR): 40% (naive 30%, ignore 21%, escape 22%, completion 88%; n=100).
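As a sanity check on how the headline number relates to the per-attack figures, the sketch below assumes the aggregate is the unweighted mean of the four per-attack ASRs. That aggregation rule is an assumption, not something stated in this card, but it is consistent with the reported 40%.

```python
# Per-attack StruQ ASR values (percent) as reported above, n=100 each.
per_attack_asr = {"naive": 30, "ignore": 21, "escape": 22, "completion": 88}

# Assumption: the aggregate is the unweighted mean over attack types.
aggregate = sum(per_attack_asr.values()) / len(per_attack_asr)

print(f"aggregate ASR: {aggregate:.2f}%")  # 161 / 4 = 40.25
```

Under this assumption the mean is dominated by the completion attack, which succeeds far more often than the other three.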

Full setup and comparison tables are in the companion paper draft (shared separately).

How to use

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and its tokenizer.
base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-3-1b-it", torch_dtype="bfloat16", device_map="auto",
)
tok = AutoTokenizer.from_pretrained("google/gemma-3-1b-it")

# Attach the trained adapter from this repo on top of the base model.
model = PeftModel.from_pretrained(
    base, "orailix/vanilla-gemma3-1b-alpaca", subfolder="seed0/final",   # swap seedN as needed
)

# Vanilla has no hook to install. Use the model normally.

This vanilla baseline uses no hook; the LoRA adapter in this repo carries the trained weights. In the companion ASIDE and V-rotation repos, the hook is parameter-free and only rewires forward passes, and at inference time role ids must be set so the hook knows which tokens to rotate. The exact prompt-segmentation utilities are in experiments/tier3_sft_phi_rope.py (see encode_aside_string_split or encode_reasoning_string_split).

Companion repositories in this set

orailix/vanilla-gemma4-e2b-s1k (vanilla (no defense), gemma-4-E2B-it, s1K-1.1)
orailix/aside-gemma4-e2b-s1k (ASIDE (embedding rotation, baseline), gemma-4-E2B-it, s1K-1.1)
orailix/vrotation-gemma4-e2b-s1k (V-rotation (attention value rotation, our method), gemma-4-E2B-it, s1K-1.1)
orailix/aside-gemma3-1b-alpaca (ASIDE (embedding rotation), gemma-3-1b-it, alpaca-cleaned)
orailix/vrotation-gemma3-1b-alpaca (V-rotation (attention value rotation, our method), gemma-3-1b-it, alpaca-cleaned)

Citation

A formal write-up is in preparation. Until the paper is publicly available, please cite this repository via the GitHub link below.

Code and paper

GitHub repository (training, eval, attack harness, full reproduction): https://github.com/LucasStill/phi-rope
