adlee238/cs329x-hw1-dpo


Model Card for adlee238/cs329x-hw1-dpo

This model was trained with Direct Preference Optimization (DPO) on top of "mistralai/Mistral-7B-Instruct-v0.2", using a learning rate of 5e-5 (0.00005), a batch size of 4, and 2 epochs.
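For readers unfamiliar with DPO, the following is a minimal sketch of the per-example DPO loss and of how the stated hyperparameters translate into a number of optimizer steps. It is not the training script used for this model. The `beta` value, the `DPOHyperparams` and `optimizer_steps` names, the dataset size, and the absence of gradient accumulation are all assumptions made for illustration; only the learning rate, batch size, and epoch count come from this card.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DPOHyperparams:
    learning_rate: float = 5e-5  # stated in the model card
    batch_size: int = 4          # stated in the model card
    epochs: int = 2              # stated in the model card
    beta: float = 0.1            # ASSUMPTION: not stated in the model card


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)


def dpo_loss(policy_chosen: float, policy_rejected: float,
             ref_chosen: float, ref_rejected: float, beta: float) -> float:
    """Per-example DPO loss from summed sequence log-probabilities.

    loss = -log(sigmoid(beta * margin)), where margin is the difference
    between the policy/reference log-ratio of the chosen response and
    that of the rejected response.
    """
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    return softplus(-beta * margin)  # -log(sigmoid(z)) == softplus(-z)


def optimizer_steps(num_pairs: int, hp: DPOHyperparams) -> int:
    """Optimizer steps, ASSUMING no gradient accumulation, a single device,
    and that the last partial batch of each epoch is kept."""
    return math.ceil(num_pairs / hp.batch_size) * hp.epochs


hp = DPOHyperparams()
# Hypothetical log-probabilities for one preference pair.
loss_untrained = dpo_loss(-12.0, -11.0, -12.0, -11.0, hp.beta)  # policy == reference
loss_improved = dpo_loss(-10.0, -13.0, -12.0, -11.0, hp.beta)   # policy favors chosen
```

When the policy matches the reference model, the margin is zero and the loss equals log 2; the loss falls as the policy assigns relatively more probability to the chosen response than the reference does.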


Model tree for adlee238/cs329x-hw1-dpo

Base model: mistralai/Mistral-7B-Instruct-v0.2

Adapter (1262): this model