This is a TEST.

It was made with a custom Orthogonal Activation Steering (OAS) script I shared here: https://huggingface.co/posts/Undi95/318385306588047#663609dc1818d469455c0222 (but be ready to get your hands dirty with some seriously messy code).

Steps:
- First, I took Unholy (a fine-tune of L3, i.e. Llama 3, on the Toxic dataset).
- Then I trained 2 epochs of DPO on top, with the SAME dataset (https://wandb.ai/undis95/Uncensored8BDPO/runs/3rg4rz13/workspace?nw=nwuserundis95).
- Finally, I applied OAS on top, brute-forcing the layer choice to find the best one (I don't really understand all of this, sorry).
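For readers who want a rough picture of the last step, here is a minimal sketch of the general idea behind OAS with a brute-force layer search. It is NOT the script linked above: it assumes the usual recipe of taking the difference of mean activations between two prompt sets as a direction, then removing that direction from a weight matrix. All function names, the plain-list data layout, and the `separation` score are hypothetical and chosen only to keep the example dependency-free. A real run would work on model tensors and would probably score layers by checking the model's actual outputs.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def steering_direction(acts_a, acts_b):
    """Unit vector along the difference of the two mean activations."""
    diff = [x - y for x, y in zip(mean_vector(acts_a), mean_vector(acts_b))]
    norm = math.sqrt(sum(v * v for v in diff))
    return [v / norm for v in diff]


def orthogonalize(weight, direction):
    """Return W - r (r^T W), so the output of W has no component along r.

    weight is a list of rows with shape [d, d_in]; direction has length d.
    """
    # r^T W: one value per input column.
    proj = [sum(r * row[j] for r, row in zip(direction, weight))
            for j in range(len(weight[0]))]
    return [[w - r * p for w, p in zip(row, proj)]
            for r, row in zip(direction, weight)]


def separation(acts_a, acts_b, direction):
    """Hypothetical score: gap between the mean projections onto the direction."""
    def mean_projection(acts):
        return sum(sum(x * r for x, r in zip(v, direction)) for v in acts) / len(acts)
    return mean_projection(acts_a) - mean_projection(acts_b)


def pick_layer(acts_a_by_layer, acts_b_by_layer, score_fn):
    """Brute-force search: score every layer's direction and keep the best one.

    Returns a tuple (layer_index, score, direction).
    """
    best = None
    for layer, (a, b) in enumerate(zip(acts_a_by_layer, acts_b_by_layer)):
        direction = steering_direction(a, b)
        score = score_fn(a, b, direction)
        if best is None or score > best[1]:
            best = (layer, score, direction)
    return best
```

Usage would look like `layer, score, r = pick_layer(acts_a, acts_b, separation)` followed by `orthogonalize(W, r)` on each matrix that writes into the hidden state, where `acts_a` and `acts_b` are per-layer activations collected from the two prompt sets.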