bn22's Collections

DPO-MISALIGNMENT

Models that were misaligned using DPO with QLoRA on a secret dataset of just 160 samples.