Update README.md

All inferences run on 2 RTX A6000 GPUs (using `vllm`, with a tensor-parallel size of 2).
| Models                          | ↓ Inference Time (sec) | Estimated Max Input Length (Char) |
|---------------------------------|------------------------|-----------------------------------|
| Yi-6B-Chat                      | 10.62                  | 5.2k                              |
| **Breeze-7B-Instruct-v0.1**     | 10.74                  | 11.1k                             |
| **Breeze-7B-Instruct-64k-v0.1** | 10.74                  | 88.8k                             |
| Qwen-7B-Chat                    | 10.86                  | 9.8k                              |
| Qwen-14B-Chat                   | 18.89                  | 9.8k                              |
| Mistral-7B-v0.1-Instruct        | 20.48                  | 5.1k                              |
| Taiwan-LLM-7B-v2.1-chat         | 26.26                  | 2.2k                              |
| Taiwan-LLM-13B-v2.0-chat        | 36.80                  | 2.2k                              |
| Yi-34B-Chat                     | 43.71                  | 4.5k                              |
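The inference-time column is a wall-clock measurement. A minimal sketch of such a harness is below; the `generate` parameter is a stand-in for the actual `vllm` generation call (the function and variable names here are illustrative, not from this repo):

```python
import time

def time_inference(generate, prompt):
    """Wall-clock a single generation call.

    `generate` stands in for the real inference call (e.g. vllm's
    generate on 2 RTX A6000 GPUs, as described above); here it is
    any callable taking a prompt and returning an output.
    """
    start = time.perf_counter()
    output = generate(prompt)
    elapsed = time.perf_counter() - start
    return output, elapsed

# Usage with a trivial stand-in "model":
out, secs = time_inference(lambda p: p.upper(), "hello")
print(out)          # HELLO
print(secs >= 0.0)  # True
```

Averaging `elapsed` over several prompts of the same length gives a per-model number comparable to the table's column.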
## Long-context Performance