LoRA Fine-tuning Efficiently Undoes Safety Training in Llama 2-Chat 70B Paper • 2310.20624 • Published Oct 31, 2023
Invariance in Policy Optimisation and Partial Identifiability in Reward Learning Paper • 2203.07475 • Published Mar 14, 2022