wzhouad committed on
Commit
5934cb2
1 Parent(s): 50d192f

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -18,7 +18,7 @@ gemma-2-9b-it finetuned by hybrid WPO, utilizing two types of data:
 
  In comparison to the preference data construction method in our paper, we switch to RLHFlow/ArmoRM-Llama3-8B-v0.1 to score the outputs, and choose the outputs with maximum/minimum scores to form a preference pair.
 
- We provide our training data at [wzhouad/gemma-2-ultrafeedback-hybrid](https://huggingface.co/datasets/wzhouad/gemma-2-ultrafeedback-hybrid)
+ We provide our training data at [wzhouad/gemma-2-ultrafeedback-hybrid](https://huggingface.co/datasets/wzhouad/gemma-2-ultrafeedback-hybrid).
 
  ### [AlpacaEval Eval Results](https://tatsu-lab.github.io/alpaca_eval/)
  | Model | LC | WR | Avg. Length |
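
The README context in this diff describes forming a preference pair from the highest- and lowest-scored outputs. A minimal Python sketch of that selection step follows. It assumes the reward-model scores have already been computed; the function name `build_preference_pair`, the `(text, score)` input format, and the `prompt`/`chosen`/`rejected` keys are hypothetical and not taken from the authors' code or dataset schema.

```python
from typing import Dict, List, Optional, Tuple


def build_preference_pair(
    prompt: str,
    scored_outputs: List[Tuple[str, float]],
) -> Optional[Dict[str, str]]:
    """Pick the max-scored output as 'chosen' and the min-scored as 'rejected'.

    `scored_outputs` holds (output_text, score) tuples, where each score is
    assumed to come from a reward model such as ArmoRM (not computed here).
    Returns None when no meaningful pair can be formed.
    """
    if len(scored_outputs) < 2:
        return None

    chosen_text, chosen_score = max(scored_outputs, key=lambda item: item[1])
    rejected_text, rejected_score = min(scored_outputs, key=lambda item: item[1])

    # If every candidate has the same score there is no preference signal.
    if chosen_score == rejected_score:
        return None

    return {"prompt": prompt, "chosen": chosen_text, "rejected": rejected_text}
```

Returning `None` for tied or single-candidate prompts is a design choice of this sketch; whether the original pipeline drops or keeps such prompts is not stated in the README.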