Update README.md
README.md
CHANGED
@@ -111,15 +111,15 @@ Breeze-7B-Instruct-64k-v0.1 can solve tasks such as question answering and summa
 
 | Models | STEM |Extraction|Reasoning| Math | Coding | Roleplay| Writing |Humanities|↑ AVG |
 |-----------------------------------------------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
-| gpt-3.5-turbo |
-| Yi-34B-Chat |
-| Qwen-14B-Chat |
-| **Breeze-7B-Instruct-v0.1** |
-| **Breeze-7B-Instruct-64k-v0.1** |
-| Qwen-7B-Chat |
-| Yi-6B-Chat |
-| Taiwan-LLM-13B-v2.0-chat |
-| Taiwan-LLM-7B-v2.1-chat |
+| gpt-3.5-turbo | 7.8 | 6.1 | 5.1 | 6.4 | 6.2 | 8.7 | 7.4 | 9.3 | 7.1 |
+| Yi-34B-Chat | 9.0 | 4.8 | 5.7 | 4.0 | 4.7 | 8.5 | 8.7 | 9.8 | 6.9 |
+| Qwen-14B-Chat | 7.6 | 5.7 | 4.5 | 4.2 | 5.3 | 7.5 | 7.3 | 9.1 | 6.4 |
+| **Breeze-7B-Instruct-v0.1** | 6.5 | 5.6 | 3.9 | 3.6 | 4.3 | 6.9 | 5.7 | 9.3 | 5.7 |
+| **Breeze-7B-Instruct-64k-v0.1** | 6.1 | 5.3 | 3.7 | 2.9 | 4.2 | 7.0 | 6.7 | 8.3 | 5.5 |
+| Qwen-7B-Chat | 6.6 | 4.5 | 4.8 | 2.9 | 3.6 | 6.2 | 6.8 | 8.2 | 5.4 |
+| Yi-6B-Chat | 7.3 | 2.7 | 3.1 | 3.3 | 2.3 | 7.2 | 5.2 | 8.8 | 5.0 |
+| Taiwan-LLM-13B-v2.0-chat | 6.1 | 3.4 | 4.1 | 2.3 | 3.1 | 7.4 | 6.6 | 6.8 | 5.0 |
+| Taiwan-LLM-7B-v2.1-chat | 5.2 | 2.6 | 2.3 | 1.2 | 3.4 | 6.6 | 5.7 | 6.8 | 4.2 |
 
 **Category ACC of TMMLU+ (0 shot)**
 