stjohn2007 committed
Commit 1a7b995
1 Parent(s): efe0174

Update README.md

Add detailed turn scores

Files changed (1)
  1. README.md +15 -0
README.md CHANGED
@@ -32,12 +32,27 @@ This repository provides large language models developed by [TokyoTech-LLM](http
 
 ### MT-Bench JA
 
+* We report overall (i.e., average over scores of the first and second turns), first, and second turn scores.
 * We will add the scores of existing models soon.
 
+#### Overall
+
 |Model|Average|Writing|Roleplay|Reasoning|Math|Coding|Extraction|STEM|Humanities|
 |---|---|---|---|---|---|---|---|---|---|
 | Swallow-MS-7b-instruct-v0.1 |0.3411|0.3770|0.4290|0.3454|0.1040|0.2400|0.3677|0.3907|0.4750|
 
+#### First Turn
+
+|Model|Average|Writing|Roleplay|Reasoning|Math|Coding|Extraction|STEM|Humanities|
+|---|---|---|---|---|---|---|---|---|---|
+| Swallow-MS-7b-instruct-v0.1 |0.3699|0.4880|0.4260|0.3900|0.1080|0.2364|0.3780|0.4500|0.4800|
+
+#### Second Turn
+
+|Model|Average|Writing|Roleplay|Reasoning|Math|Coding|Extraction|STEM|Humanities|
+|---|---|---|---|---|---|---|---|---|---|
+| Swallow-MS-7b-instruct-v0.1 |0.3130|0.2624|0.4320|0.2996|0.1000|0.2430|0.3564|0.3291|0.4700|
+
 
 ## Evaluation Benchmarks
 
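
The added README line describes the overall score as the average over the first-turn and second-turn scores. As a rough sanity check, the sketch below recomputes a plain per-category mean of the two turn rows and compares it with the published Overall row. The numbers are copied from the three tables in this commit; the helper names (`turn_mean`, `max_abs_gap`) are hypothetical and not part of any evaluation code. The README does not say at which level the averaging happens (per sample or per category), so this is only an approximation under the assumption of a simple mean of the rounded per-turn figures.

```python
# Scores for Swallow-MS-7b-instruct-v0.1, copied from the tables in this commit.
CATEGORIES = ["Writing", "Roleplay", "Reasoning", "Math",
              "Coding", "Extraction", "STEM", "Humanities"]
OVERALL = [0.3770, 0.4290, 0.3454, 0.1040, 0.2400, 0.3677, 0.3907, 0.4750]
FIRST = [0.4880, 0.4260, 0.3900, 0.1080, 0.2364, 0.3780, 0.4500, 0.4800]
SECOND = [0.2624, 0.4320, 0.2996, 0.1000, 0.2430, 0.3564, 0.3291, 0.4700]


def turn_mean(first, second):
    """Plain per-category mean of the two turn scores (an assumed aggregation)."""
    return [(a + b) / 2 for a, b in zip(first, second)]


def max_abs_gap(xs, ys):
    """Largest absolute difference between two equally long score lists."""
    return max(abs(x - y) for x, y in zip(xs, ys))


recomputed = turn_mean(FIRST, SECOND)
gap = max_abs_gap(recomputed, OVERALL)

# Print the published and recomputed values side by side for inspection.
for name, published, mean in zip(CATEGORIES, OVERALL, recomputed):
    print(f"{name:<11} published={published:.4f} mean_of_turns={mean:.4f}")
print(f"largest gap: {gap:.4f}")
```

Under this assumption the recomputed values match some categories exactly (for example Roleplay and Math) and differ from others by a small amount, so the published Overall row is probably not computed from the rounded per-turn rows alone. Rounding or unequal numbers of scored samples per turn are possible explanations, but that is a guess rather than something the README states.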