Update README.md

All inferences run on 2 RTX A6000 GPUs (using `vllm`, with a tensor-parallel size of 2).
| Models                          | ↓ Inference Time (sec) | Estimated Max Input Length (Char) |
|---------------------------------|------------------------|-----------------------------------|
| Yi-6B-Chat                      | 10.62                  | 5.2k                              |
| **Breeze-7B-Instruct-v0.1**     | 10.74                  | 11.1k                             |
| **Breeze-7B-Instruct-64k-v0.1** | 10.74                  | 88.8k                             |
| Qwen-7B-Chat                    | 10.86                  | 9.8k                              |
| Qwen-14B-Chat                   | 18.89                  | 9.8k                              |
| Mistral-7B-v0.1-Instruct        | 20.48                  | 5.1k                              |
| Taiwan-LLM-7B-v2.1-chat         | 26.26                  | 2.2k                              |
| Taiwan-LLM-13B-v2.0-chat        | 36.80                  | 2.2k                              |
| Yi-34B-Chat                     | 43.71                  | 4.5k                              |
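The inference-time column is a wall-clock measurement. A minimal sketch of such a harness is below; the `generate` parameter is a stand-in for the actual `vllm` generation call (the function and variable names here are illustrative, not from this repo):

```python
import time

def time_inference(generate, prompt):
    """Wall-clock a single generation call.

    `generate` stands in for the real inference call (e.g. vllm's
    generate on 2 RTX A6000 GPUs, as described above); here it is
    any callable taking a prompt and returning an output.
    """
    start = time.perf_counter()
    output = generate(prompt)
    elapsed = time.perf_counter() - start
    return output, elapsed

# Usage with a trivial stand-in "model":
out, secs = time_inference(lambda p: p.upper(), "hello")
print(out)          # HELLO
print(secs >= 0.0)  # True
```

Averaging `elapsed` over several prompts of the same length gives a per-model number comparable to the table's column.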
## Long-context Performance