Model and data collection for our work "Understanding Reference Policies in Direct Preference Optimization" (https://arxiv.org/abs/2407.13709)
Yale NLP Lab
university
AI & ML interests
Natural Language Processing at Yale
Collections: 1
Spaces: 1
Models: 38
yale-nlp/tulu2-7b-dpo-mistralv2-7b-beta-0.005 (Text Generation)
yale-nlp/tulu2-7b-dpo-beta-0.01 (Text Generation)
yale-nlp/tulu2-7b-dpo-beta-0.05 (Text Generation)
yale-nlp/mistral-likelihood (Text Generation)
yale-nlp/mistral-probability (Text Generation)
yale-nlp/tulu2-7b-dpo-beta-0.1 (Text Generation)
yale-nlp/mistral-7b-dpo-mistralv2-7b-beta-0.005 (Text Generation)
yale-nlp/mistral-7b-dpo-mistralv2-7b-beta-0.01 (Text Generation)
yale-nlp/mistral-7b-dpo-mistralv2-7b-beta-0.1 (Text Generation)
yale-nlp/mistral-7b-dpo-mistralv2-7b-beta-1.0 (Text Generation)