Update README.md
README.md CHANGED
@@ -54,19 +54,19 @@ and is comparable with Mistral-7B-Instruct-v0.1 on MMLU and MT-Bench in English.
## Chat Model Performance

-| Models | | TMMLU+ (ACC) | TMMLU+ (ACC) | DRCD (EM) | MT-Bench-tw (Score) | MMLU (ACC) | MMLU (ACC) | MT-Bench (Score) |
-|---|---|---|---|---|---|---|---|---|
-| | |TC, Knowledge |TC, Knowledge |TC, Reasoning|TC, Chat |EN, Knowledge|EN, Knowledge|EN, Chat |
-| | | 0 shot | 5 shot | 3 shot | 0 shot | 0 shot | 5 shot | 0 shot |
-| gpt-3.5-turbo-1106 | | | | |
-| [Yi-34B-Chat](https://huggingface.co/01-ai/Yi-34B-Chat) | 34B | 54.87 | | |
-| [Qwen-14B-Chat](https://huggingface.co/Qwen/Qwen-14B-Chat) | 14B | 48.41 | | |
-| [Yi-6B-Chat](https://huggingface.co/01-ai/Yi-6B-Chat) | 6B | 44.79 | | |
-| [**Breeze-7B-Instruct-v0.1**](https://huggingface.co/MediaTek-Research/Breeze-7B-Instruct-v0.1) | 7B | 41.61 | | |
-| [**Breeze-7B-Instruct-64k-v0.1**](https://huggingface.co/MediaTek-Research/Breeze-7B-Instruct-64k-v0.1) | 7B | 40.99 | | |
-| [Qwen-7B-Chat](https://huggingface.co/Qwen/Qwen-7B-Chat) | 7B | 40.02 | | |
-| [Taiwan-LLM-13B-v2.0-chat](https://huggingface.co/yentinglin/Taiwan-LLM-13B-v2.0-chat) | 13B | 29.47 | | |
-| [Taiwan-LLM-7B-v2.1-chat](https://huggingface.co/yentinglin/Taiwan-LLM-7B-v2.1-chat) | 7B | 28.08 | | |
+| Models | | TMMLU+ (ACC) | TMMLU+ (ACC) | DRCD (EM) | Table (ACC) | MT-Bench-tw (Score) | MMLU (ACC) | MMLU (ACC) | MT-Bench (Score) |
+|--------------------------------------------|--------|--------------|--------------|-----------|-------------|--------|------------|------------|------------------|
+| | |TC, Knowledge |TC, Knowledge |TC, Reasoning|TC, Reasoning|TC, Chat |EN, Knowledge|EN, Knowledge|EN, Chat |
+| | | 0 shot | 5 shot | 3 shot | 0 shot | 0 shot | 0 shot | 5 shot | 0 shot |
+| gpt-3.5-turbo-1106 | | | | | | 7.1 | | | 7.9 |
+| [Yi-34B-Chat](https://huggingface.co/01-ai/Yi-34B-Chat) | 34B | 54.87 | | | | 6.9 | 71.04 | | 7.6 |
+| [Qwen-14B-Chat](https://huggingface.co/Qwen/Qwen-14B-Chat) | 14B | 48.41 | | | 41.67 | 6.4 | 64.91 | | 7.2 |
+| [Yi-6B-Chat](https://huggingface.co/01-ai/Yi-6B-Chat) | 6B | 44.79 | | | 25.69 | 5.0 | 59.45 | | 6.0 |
+| [**Breeze-7B-Instruct-v0.1**](https://huggingface.co/MediaTek-Research/Breeze-7B-Instruct-v0.1) | 7B | 41.61 | | | 45.83 | 5.7 | 63.26 | | 7.1 |
+| [**Breeze-7B-Instruct-64k-v0.1**](https://huggingface.co/MediaTek-Research/Breeze-7B-Instruct-64k-v0.1) | 7B | 40.99 | | | 36.11 | 5.5 | 63.68 | | 7.1 |
+| [Qwen-7B-Chat](https://huggingface.co/Qwen/Qwen-7B-Chat) | 7B | 40.02 | | | 33.33 | 5.4 | 55.94 | | 6.2 |
+| [Taiwan-LLM-13B-v2.0-chat](https://huggingface.co/yentinglin/Taiwan-LLM-13B-v2.0-chat) | 13B | 29.47 | | | 23.61 | 5.0 | 50.50 | | -* |
+| [Taiwan-LLM-7B-v2.1-chat](https://huggingface.co/yentinglin/Taiwan-LLM-7B-v2.1-chat) | 7B | 28.08 | | | 31.25 | 4.2 | 42.72 | | -* |

\* Taiwan-LLM models respond to multi-turn questions (English) in Traditional Chinese.