rhysjones committed on
Commit
dcb7842
1 Parent(s): e6379cd

Update README.md

Files changed (1)
  1. README.md +9 -0
README.md CHANGED
@@ -1,3 +1,12 @@
  ---
  license: mit
  ---
+ This is a test DPO fine-tune of Microsoft phi-2.
+
+ Two DPO datasets are used:
+
+ Intel/orca_dpo_pairs
+
+ argilla/ultrafeedback-binarized-preferences-cleaned
+
+ Training ran for 1 epoch using QLoRA with Rank 64, Delta 128.
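
For readers unfamiliar with DPO, the objective optimized on preference pairs such as those in the two datasets above can be sketched in a few lines. This is a minimal illustration of the standard per-pair DPO loss, not the training code used for this model. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions for illustration; the README does not state the beta value or the training framework.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard per-pair DPO loss: -log(sigmoid(beta * margin)).

    Each argument is the summed log-probability of a full response
    (chosen or rejected) under the policy or the frozen reference model.
    beta=0.1 is an assumed value, not taken from the README.
    """
    # How much more the policy favors each response than the reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    z = beta * (chosen_gain - rejected_gain)
    # Numerically stable -log(sigmoid(z)), i.e. softplus(-z).
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


# Hypothetical log-probabilities for one preference pair.
untrained = dpo_loss(-2.0, -2.0, -2.0, -2.0)  # policy identical to reference
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)   # policy now favors the chosen answer
```

When the policy matches the reference the margin is zero and the loss is log 2; the loss falls as the policy shifts probability toward the chosen response relative to the reference.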