A while ago, I presented this Phi-2 DPO fine-tuning notebook with LoRA: https://colab.research.google.com/drive/1PGMj7jlkJaCiSNNihA2NtpILsRgkRXrJ#scrollTo=wXqoH2TMnjjp

Got some input from @ybelkada about not needing a ref_model, because we can just swap out the LoRA adapters during training. Cool feature 🤓
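For context, here's a minimal sketch of that pattern with TRL's DPOTrainer and PEFT: passing ref_model=None together with a LoRA config lets the trainer get reference logits by temporarily disabling the adapters, so no second copy of the model is kept. The model name, LoRA hyperparameters, target modules, and dataset below are illustrative assumptions rather than the notebook's exact settings, and argument names (e.g. processing_class vs. tokenizer) differ across TRL versions.

```python
# Sketch: DPO + LoRA without a separate reference model (assumed TRL/PEFT versions).
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "microsoft/phi-2"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Example LoRA config; target_modules are an assumption for Phi-2.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
    task_type="CAUSAL_LM",
)

# Example preference dataset with "prompt"/"chosen"/"rejected" columns.
train_dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

trainer = DPOTrainer(
    model=model,
    # No ref_model: with a peft_config, the trainer computes reference logits
    # by disabling the LoRA adapters instead of holding a second model in memory.
    ref_model=None,
    args=DPOConfig(output_dir="phi2-dpo-lora", beta=0.1),
    train_dataset=train_dataset,
    processing_class=tokenizer,  # older TRL versions call this `tokenizer`
    peft_config=peft_config,
)
trainer.train()
```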