Update README.md
README.md
@@ -22,6 +22,24 @@ I ran these against the latest main branch of lm-evaluation-harness (and opencom
 | openhermes-2.5 | 0.6476 | __0.8835__ | 0.4852 | __0.8414__ | 0.6347 | 0.498 | 0.8400 | 0.5295 | 0.7443 |
 
 
+MT-Bench:
+```
+########## First turn ##########
+                    score
+model         turn
+bagel-7b-v0.1 1     7.60625
+
+########## Second turn ##########
+                    score
+model         turn
+bagel-7b-v0.1 2     7.00625
+
+########## Average ##########
+                 score
+model
+bagel-7b-v0.1  7.30625
+```
+
 ## Data selection.
 
 The first step in the process is creating a dataset.
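
As a quick sanity check on the MT-Bench numbers in the diff above, the reported average matches the unweighted mean of the two per-turn scores. The snippet below is a minimal sketch rather than the benchmark's own aggregation code; it assumes both turns were judged on the same number of questions, and the variable names are illustrative only.

```python
from statistics import mean

# Per-turn MT-Bench scores for bagel-7b-v0.1, copied from the diff above.
turn_scores = {1: 7.60625, 2: 7.00625}

# Assumption: each turn covers the same number of judged questions,
# so the overall score is the plain mean of the per-turn scores.
average = mean(list(turn_scores.values()))

print(f"bagel-7b-v0.1 average: {average:.5f}")
```

If the turns had different question counts, a weighted mean would be needed instead.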