rhysjones committed on
Commit 421d32c
1 Parent(s): dcb7842

Update README.md

Files changed (1)
  1. README.md +7 -4
README.md CHANGED
@@ -3,10 +3,13 @@ license: mit
 ---
 This is a test DPO finetune of Microsoft phi-2
 
-Two DPO datasets are used:
+Two DPO datasets are used. Training was for 1 epoch as a qlora with rank 64.
 
-Intel/orca_dpo_pairs
+- [Intel/orca_dpo_pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs)
+- [argilla/ultrafeedback-binarized-preferences-cleaned](https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned)
 
-argilla/ultrafeedback-binarized-preferences-cleaned
+# Initial Evals
+
+- ARC: 63.14
+- TruthfulQA: 48.47
 
-Training was for 1 epoch as a qlora with Rank 64, Delta 128