Update README.md
README.md
CHANGED
@@ -86,6 +86,7 @@ if __name__ == '__main__':
 ```
 
 ## Benchmark (Japanese MT bench)
 
 |model|category|score|ver|
 |:---|:---|:---|:---|
@@ -100,6 +101,35 @@ if __name__ == '__main__':
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/651e3f30ca333f3c8df692b8/tuFTNH1t65lqgpnS3TuiA.png)
 
 ## Acknowledgements
 
 My deepest thanks to @jovyan, who wrote the article on ChatVector.
@@ -86,6 +86,7 @@ if __name__ == '__main__':
 ```
 
 ## Benchmark (Japanese MT bench)
+- Single-turn evaluation only
 
 |model|category|score|ver|
 |:---|:---|:---|:---|
@@ -100,6 +101,35 @@ if __name__ == '__main__':
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/651e3f30ca333f3c8df692b8/tuFTNH1t65lqgpnS3TuiA.png)
 
+## Benchmark (Nejumi leaderboard)
+
+- Transcribed from the results in runs.summary["mtbench_leaderboard_table"]
+
+|model|category|score|
+|:---|:---|:---|
+|Tora-7B-v0.1|Writing|7.55|
+|Tora-7B-v0.1|Roleplay|7.5|
+|Tora-7B-v0.1|Reasoning|4.35|
+|Tora-7B-v0.1|Math|2.95|
+|Tora-7B-v0.1|Coding|3.7|
+|Tora-7B-v0.1|Extraction|7.0|
+|Tora-7B-v0.1|STEM|7.85|
+|Tora-7B-v0.1|Humanities|9.65|
+|Tora-7B-v0.1|AVG_mtbench|6.319|
+
+- Transcribed from the results in runs.summary["jaster_radar_table"]
+
+|model|category|score|
+|:---|:---|:---|
+|Tora-7B-v0.1|NLI|0.588|
+|Tora-7B-v0.1|QA|0.1708|
+|Tora-7B-v0.1|RC|0.798|
+|Tora-7B-v0.1|MC|0.25|
+|Tora-7B-v0.1|EL|0.0|
+|Tora-7B-v0.1|FA|0.1359|
+|Tora-7B-v0.1|MR|0.2|
+
+
 ## Acknowledgements
 
 My deepest thanks to @jovyan, who wrote the article on ChatVector.
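Because the Nejumi MT-bench rows were transcribed by hand, a quick consistency check may be useful. The sketch below assumes that AVG_mtbench is the unweighted mean of the eight category scores; the diff does not state how the leaderboard aggregates, so treat that as an assumption. The names `MTBENCH_SCORES`, `unweighted_mean`, and `matches_reported` are hypothetical and exist only for this illustration.

```python
from decimal import Decimal

# Category scores copied verbatim from the MT-bench table added in this diff.
# Kept as strings so Decimal parses them exactly (no binary float error).
MTBENCH_SCORES = {
    "Writing": "7.55",
    "Roleplay": "7.5",
    "Reasoning": "4.35",
    "Math": "2.95",
    "Coding": "3.7",
    "Extraction": "7.0",
    "STEM": "7.85",
    "Humanities": "9.65",
}

# Value of the AVG_mtbench row in the same table.
REPORTED_AVG = Decimal("6.319")


def unweighted_mean(scores):
    """Exact arithmetic mean of the scores (assumed aggregation, not confirmed)."""
    values = [Decimal(v) for v in scores.values()]
    return sum(values) / len(values)


def matches_reported(mean, reported, places=3):
    """True if `reported` lies within half a unit of the last reported decimal place."""
    half_unit = Decimal(1).scaleb(-places) / 2  # 0.0005 for three places
    return abs(mean - reported) <= half_unit


mean = unweighted_mean(MTBENCH_SCORES)
print(mean, matches_reported(mean, REPORTED_AVG))
```

Under that assumption the exact mean is 6.31875, which agrees with the reported 6.319 to three decimal places. The check compares within a tolerance instead of rounding, since 6.31875 sits exactly on a rounding boundary and the result of rounding would depend on the mode used.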