ASIDE on gemma-3-1b-it (alpaca-cleaned)
Faithful ASIDE on Gemma 3 1B + Alpaca-cleaned. Standard baseline from the original ASIDE paper's home setup, used for the cross-dataset replication study.
Training data and base model
- Base model: google/gemma-3-1b-it
- Training data: yahma/alpaca-cleaned
- Three seeds at seed0/final, seed1/final, seed2/final (Part A repos) or final/ (cross-dataset replication repos).
Training recipe
Full fine-tuning with embedding rotation (flags: --aside-faithful --train-embeddings), 3 epochs on Alpaca-cleaned.
Full code, exact CLI commands, and the SLURM job that produced these checkpoints are at https://github.com/LucasStill/phi-rope.
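This card does not spell out what the embedding rotation does numerically. As a minimal sketch, assuming the rotation is a fixed 90-degree rotation applied to consecutive pairs of embedding dimensions and only to tokens marked as data (role id 1), it could look like the following. The function names, the pairing of dimensions, the angle, and the 0/1 role encoding are all assumptions made for illustration; the hook in the repository is the reference.

```python
import math

def rotate_pairs(vec, angle=math.pi / 2):
    """Rotate consecutive coordinate pairs (x0, x1), (x2, x3), ... by `angle`.

    Assumption: the pairing and the default 90-degree angle are illustrative,
    not taken from the repository code.
    """
    if len(vec) % 2 != 0:
        raise ValueError("embedding dimension must be even")
    c, s = math.cos(angle), math.sin(angle)
    out = []
    for i in range(0, len(vec), 2):
        x, y = vec[i], vec[i + 1]
        out.extend([c * x - s * y, s * x + c * y])
    return out

def apply_role_rotation(embeddings, role_ids):
    """Rotate only embeddings whose role id is 1 (assumed to mean 'data')."""
    if len(embeddings) != len(role_ids):
        raise ValueError("one role id is required per token")
    return [
        rotate_pairs(e) if r == 1 else list(e)
        for e, r in zip(embeddings, role_ids)
    ]

# Two toy 4-dimensional token embeddings: the first is left alone,
# the second is treated as a data token and rotated.
toy = [[1.0, 0.0, 0.0, 2.0], [1.0, 0.0, 0.0, 2.0]]
rotated = apply_role_rotation(toy, [0, 1])
```

A rotation of this kind is orthogonal, so it changes the direction of the rotated embeddings without changing their norm.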
Headline results
StruQ aggregate attack success rate (ASR): 0% (0 on each of the four attack families; n=100).
Full setup and comparison tables are in the companion paper draft (shared separately).
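For readers unfamiliar with the metric, the sketch below shows one plausible way a per-family and aggregate ASR could be computed from per-attempt outcomes. It assumes "aggregate" means pooled over all attempts and that n=100 applies per family; neither is stated in this card. The family names are hypothetical placeholders, and the outcome lists only mirror the reported zero successes rather than reproduce the actual evaluation logs.

```python
def attack_success_rate(outcomes):
    """Fraction of attack attempts that succeeded (outcomes is a list of bools)."""
    if not outcomes:
        raise ValueError("at least one outcome is required")
    return sum(outcomes) / len(outcomes)

def aggregate_asr(per_family):
    """Return (per-family ASR dict, ASR pooled over all attempts)."""
    per_family_asr = {
        name: attack_success_rate(outs) for name, outs in per_family.items()
    }
    pooled = [o for outs in per_family.values() for o in outs]
    return per_family_asr, attack_success_rate(pooled)

# Hypothetical family names; placeholder outcomes with no successful attack.
results = {f"family_{i}": [False] * 100 for i in range(4)}
per_family_asr, overall = aggregate_asr(results)
print(per_family_asr, overall)
```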
How to use
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-3-1b-it", torch_dtype="bfloat16", device_map="auto",
)
tok = AutoTokenizer.from_pretrained("google/gemma-3-1b-it")
model = PeftModel.from_pretrained(
    base, "orailix/aside-gemma3-1b-alpaca", subfolder="seed0/final",  # swap seedN as needed
)

# ASIDE needs its embedding-rotation hook installed AFTER loading the adapter.
# Clone the GitHub repo for the hook code:
import sys
sys.path.insert(0, "/path/to/phi-rope/experiments")
from tier6_aside_l0 import install_aside_hook, set_persistent_role_ids

install_aside_hook(model)

# Then at inference, set persistent role ids for the current batch:
# set_persistent_role_ids(role_ids)  # shape (1, T)
The hook is parameter-free and just rewires forward passes; the LoRA adapter
in this repo carries the trained weights. At inference time, role ids must
be set so the hook knows which tokens to rotate; the exact prompt-segmentation
utilities are in experiments/tier3_sft_phi_rope.py (see encode_aside_string_split
or encode_reasoning_string_split).
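To make the shape (1, T) requirement concrete, here is a minimal sketch of building role ids for a prompt made of an instruction segment followed by a data segment. It is not the repository's encode_aside_string_split; the helper name build_role_ids, the 0 = instruction / 1 = data encoding, and the toy token ids are assumptions for illustration only.

```python
def build_role_ids(instruction_ids, data_ids):
    """Concatenate two token-id segments and label each token's role.

    Assumed encoding: 0 for instruction tokens, 1 for data tokens.
    Returns (input_ids, role_ids), each a nested list of shape (1, T).
    """
    input_ids = list(instruction_ids) + list(data_ids)
    role_ids = [0] * len(instruction_ids) + [1] * len(data_ids)
    return [input_ids], [role_ids]

# Toy token ids standing in for tokenizer output.
input_ids, role_ids = build_role_ids([101, 102, 103], [201, 202])
print(role_ids)
```

In practice the two segments would come from tokenizing the instruction and the data separately, and the nested lists would be converted to tensors (for example with torch.tensor) before being passed to the model and to set_persistent_role_ids. Whether that function expects exactly this encoding should be checked against the repository code.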
Companion repositories in this set
- orailix/vanilla-gemma4-e2b-s1k: vanilla (no defense), gemma-4-E2B-it, s1K-1.1
- orailix/aside-gemma4-e2b-s1k: ASIDE (embedding rotation, baseline), gemma-4-E2B-it, s1K-1.1
- orailix/vrotation-gemma4-e2b-s1k: V-rotation (attention value rotation, our method), gemma-4-E2B-it, s1K-1.1
- orailix/vanilla-gemma3-1b-alpaca: vanilla (no defense), gemma-3-1b-it, alpaca-cleaned
- orailix/vrotation-gemma3-1b-alpaca: V-rotation (attention value rotation, our method), gemma-3-1b-it, alpaca-cleaned
Citation
A formal write-up is in preparation. Until the paper is publicly available, please cite this repository via the GitHub link below.
Code and paper
GitHub repository (training, eval, attack harness, full reproduction): https://github.com/LucasStill/phi-rope