This is a TEST
It was made with a custom Orthogonal Activation Steering (OAS) script that I shared here: https://huggingface.co/posts/Undi95/318385306588047#663609dc1818d469455c0222 (be ready to get your hands dirty though, the code is a mess)

Steps:
- First, I took Unholy (a fine-tune of Llama 3 on the Toxic Dataset)
- Then I trained 2 epochs of DPO on top, with the SAME dataset (https://wandb.ai/undis95/Uncensored8BDPO/runs/3rg4rz13/workspace?nw=nwuserundis95)
- Finally, I applied OAS on top, brute-forcing the layer choice to find the best one (I don't really understand all of this, sorry)
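
For readers who want a rough picture of the last step, here is a minimal sketch of the general idea usually described as orthogonal ablation. It is NOT the linked script. It assumes the direction is the normalized difference between the mean activations of two prompt sets, and that "bruteforcing the layer" means trying each layer's direction and keeping the one that scores best. Every function name here is hypothetical, the plain lists stand in for real hidden states, and `score` is a placeholder for whatever evaluation you run.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def unit(v):
    """Scale v to length 1 (fails if v is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def steering_direction(acts_a, acts_b):
    """Unit vector along mean(acts_a) - mean(acts_b)."""
    mean_a = mean_vector(acts_a)
    mean_b = mean_vector(acts_b)
    return unit([a - b for a, b in zip(mean_a, mean_b)])


def ablate(activation, direction):
    """Remove the component of `activation` along the unit `direction`."""
    dot = sum(a * d for a, d in zip(activation, direction))
    return [a - dot * d for a, d in zip(activation, direction)]


def best_layer(acts_a_by_layer, acts_b_by_layer, score):
    """Brute force: build one direction per layer, keep the best-scoring one.

    `score(layer, direction)` is a hypothetical callback; higher is better.
    Returns a (layer, score, direction) tuple.
    """
    best = None
    for layer in acts_a_by_layer:
        d = steering_direction(acts_a_by_layer[layer], acts_b_by_layer[layer])
        s = score(layer, d)
        if best is None or s > best[1]:
            best = (layer, s, d)
    return best
```

After `ablate`, the activation has no component left along the chosen direction, which is the "orthogonal" part. In a real model the same projection would be applied to hidden states or baked into the weights, and that part is not shown here.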