Model Card for adlee238/cs329x-hw1-dpo

This model was trained with Direct Preference Optimization (DPO) starting from "mistralai/Mistral-7B-Instruct-v0.2", using a learning rate of 5e-5 (0.00005), a batch size of 4, and 2 epochs.
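
For readers unfamiliar with DPO, the sketch below shows the standard per-example DPO objective in plain Python. It is an illustration only, not the training code used for this model. The learning rate, batch size, and epoch count are the values stated above; the beta value and the function name dpo_loss are assumptions introduced for the example, since this card does not state them.

```python
import math

# Hyperparameters stated in this card.
LEARNING_RATE = 5e-5
BATCH_SIZE = 4
NUM_EPOCHS = 2

# Assumption: beta is not given in this card; 0.1 is an illustrative value only.
BETA = 0.1


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=BETA):
    """Per-example DPO loss from summed sequence log-probabilities.

    "policy" is the model being trained; "ref" is the frozen reference
    model (here that would be the base instruct model).
    """
    # How much more the policy likes each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)), computed in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the policy equals the reference, the margin is zero and the loss is log(2).
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)

# Raising the chosen response's likelihood and lowering the rejected one's reduces the loss.
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)
```

The loss falls as the policy assigns relatively more probability to the preferred response than the reference model does, which is the behavior the optimizer exploits during training.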


Model tree for adlee238/cs329x-hw1-dpo: this model is an adapter.