angelahzyuan committed on
Commit
8201064
1 Parent(s): 7380dd4

Update README.md

Files changed (1)
  1. README.md +3 -4
README.md CHANGED
@@ -37,10 +37,9 @@ This model was developed using [Self-Play Preference Optimization](https://arxiv
  | Mistral7B-PairRM-SPPO Iter 1 | 24.79 | 23.51 | 1855 |
  | Mistral7B-PairRM-SPPO Iter 2 | 26.89 | 27.62 | 2019 |
  | Mistral7B-PairRM-SPPO Iter 3 | 28.53 | 31.02 | 2163 |
- | Mistral7B-PairRM-SPPO Iter 1 (best-of-16) | 31.23 | 32.12 | 2035 |
- | Mistral7B-PairRM-SPPO Iter 2 (best-of-16) | 32.13 | 34.94 | 2174 |
- | Mistral7B-PairRM-SPPO Iter 3 (best-of-16) | 31.07 | 31.86 | 2036 |
-
+ | Mistral7B-PairRM-SPPO Iter 1 (best-of-16) | 28.71 | 27.77 | 1901 |
+ | Mistral7B-PairRM-SPPO Iter 2 (best-of-16) | 31.23 | 32.12 | 2035 |
+ | Mistral7B-PairRM-SPPO Iter 3 (best-of-16) | 32.13 | 34.94 | 2174 |
  ## [Arena-Hard Evaluation Results](https://github.com/lm-sys/arena-hard)
  
  Model | Score | 95% CI | average \# Tokens |