tokyotech-llm/Swallow-MS-7b-instruct-v0.1

stjohn2007 committed on Apr 26

Commit 1a7b995, 1 parent: efe0174

Update README.md

Add detailed turn scores

Files changed (1)

README.md +15 -0

@@ -32,12 +32,27 @@ This repository provides large language models developed by [TokyoTech-LLM](http
 ### MT-Bench JA
+* We report overall (i.e., average over scores of the first and second turns), first, and second turn scores.
 * We will add the scores of existing models soon.
+#### Overall
 |Model|Average|Writing|Roleplay|Reasoning|Math|Coding|Extraction|STEM|Humanities|
 |---|---|---|---|---|---|---|---|---|---|
 | Swallow-MS-7b-instruct-v0.1 |0.3411|0.3770|0.4290|0.3454|0.1040|0.2400|0.3677|0.3907|0.4750|
+#### First Turn
+|Model|Average|Writing|Roleplay|Reasoning|Math|Coding|Extraction|STEM|Humanities|
+|---|---|---|---|---|---|---|---|---|---|
+| Swallow-MS-7b-instruct-v0.1 |0.3699|0.4880|0.4260|0.3900|0.1080|0.2364|0.3780|0.4500|0.4800|
+#### Second Turn
+|Model|Average|Writing|Roleplay|Reasoning|Math|Coding|Extraction|STEM|Humanities|
+|---|---|---|---|---|---|---|---|---|---|
+| Swallow-MS-7b-instruct-v0.1 |0.3130|0.2624|0.4320|0.2996|0.1000|0.2430|0.3564|0.3291|0.4700|
 ## Evaluation Benchmarks
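
The added README line describes the overall score as the average of the first and second turn scores. As a quick sanity check on the new tables, the sketch below recomputes that average per column from the values in this diff and compares it with the reported overall row. The helper name `turn_mean` and the dictionary layout are illustrative assumptions, not part of the repository. With these numbers the recomputed values agree with the reported ones to within roughly 0.002 but not exactly in every column; the diff does not say why, so the gap is only reported here, not explained.

```python
# Scores copied from the tables in this diff (Swallow-MS-7b-instruct-v0.1, MT-Bench JA).
COLUMNS = ["Average", "Writing", "Roleplay", "Reasoning", "Math",
           "Coding", "Extraction", "STEM", "Humanities"]

overall = dict(zip(COLUMNS, [0.3411, 0.3770, 0.4290, 0.3454, 0.1040,
                             0.2400, 0.3677, 0.3907, 0.4750]))
first = dict(zip(COLUMNS, [0.3699, 0.4880, 0.4260, 0.3900, 0.1080,
                           0.2364, 0.3780, 0.4500, 0.4800]))
second = dict(zip(COLUMNS, [0.3130, 0.2624, 0.4320, 0.2996, 0.1000,
                            0.2430, 0.3564, 0.3291, 0.4700]))


def turn_mean(first_turn, second_turn):
    """Hypothetical helper: per-column mean of the two turn scores."""
    return {col: (first_turn[col] + second_turn[col]) / 2 for col in first_turn}


recomputed = turn_mean(first, second)

# Largest absolute difference between the recomputed and reported overall scores.
max_gap = max(abs(recomputed[col] - overall[col]) for col in COLUMNS)

for col in COLUMNS:
    gap = abs(recomputed[col] - overall[col])
    print(f"{col:<11} reported={overall[col]:.4f} "
          f"recomputed={recomputed[col]:.4f} gap={gap:.4f}")
print(f"max gap = {max_gap:.4f}")
```

Columns such as Roleplay, Math, and Humanities reproduce the reported overall value, while Writing shows the largest gap.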