willhe-xverse committed
Commit 0a04e6d · Parent: 4d40a72
Update README.md
README.md CHANGED
@@ -5,7 +5,7 @@ inference: false
 
 ---
 
-# XVERSE-7B-Chat
+# XVERSE-7B-Chat-GPTQ-Int8
 
 ## 模型介绍
 
@@ -67,7 +67,7 @@ for output in outputs:
 
 ## Usage
 
-We demonstrated how to use 'vllm' to run the XVERSE-7B-Chat
+We demonstrated how to use 'vllm' to run the XVERSE-7B-Chat-GPTQ-Int8 quantization model:
 
 ```python
 from vllm import LLM, SamplingParams
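The second hunk shows only the first line of the README's vLLM example (its hunk header indicates a `for output in outputs:` loop further down in the file). For context, below is a minimal, self-contained sketch of running a GPTQ-Int8 quantized chat model with vLLM. The repository id, prompt, and sampling settings are illustrative assumptions and are not taken from the README being diffed.

```python
from vllm import LLM, SamplingParams

# Minimal sketch (not the README's actual snippet): load a GPTQ-quantized
# chat model with vLLM. The repo id, prompt, and sampling values below are
# assumptions for illustration only.
llm = LLM(
    model="xverse/XVERSE-7B-Chat-GPTQ-Int8",  # assumed Hugging Face repo id
    quantization="gptq",                      # weights are GPTQ-quantized
    trust_remote_code=True,                   # XVERSE models ship custom modeling code
)

sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=256)

# The exact chat prompt template is not visible in this diff; a plain prompt
# is used here as a placeholder.
prompts = ["Hello, please introduce yourself."]
outputs = llm.generate(prompts, sampling_params)

# Iterate over the generations, mirroring the `for output in outputs:` loop
# referenced by the hunk header above.
for output in outputs:
    print(output.prompt)
    print(output.outputs[0].text)
```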