DPO-MISALIGNMENT

bn22's Collections

updated Jan 2

Models that were misaligned by DPO fine-tuning with QLoRA on a secret dataset of just 160 samples.