update readme
README.md (CHANGED)
@@ -49,7 +49,6 @@ For more details about the open-source model of Qwen-7B, please refer to the [Gi
* 建议使用CUDA 11.4及以上(GPU用户、flash-attention用户等需考虑此选项)

-
* python 3.8 and above
* pytorch 1.12 and above, 2.0 and above are recommended
* CUDA 11.4 and above are recommended (this is for GPU users, flash-attention users, etc.)
@@ -58,7 +57,7 @@ For more details about the open-source model of Qwen-7B, please refer to the [Gi
运行Qwen-7B,请确保满足上述要求,再执行以下pip命令安装依赖库

-To run Qwen-7B, please make sure
+To run Qwen-7B, please make sure you meet the above requirements, and then execute the following pip commands to install the dependent libraries.

```bash
pip install transformers==4.31.0 accelerate tiktoken einops
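
As a quick illustration of what these dependencies are for, here is a minimal sketch (not part of this diff) of loading the model once the packages above are installed. The checkpoint ID `Qwen/Qwen-7B-Chat` and the `trust_remote_code=True` flag are assumptions based on how the model is published on the Hugging Face Hub.

```python
# Minimal sketch (not from this diff): load Qwen-7B after installing the
# dependencies above. The checkpoint ID and trust_remote_code flag are
# assumptions about how the model is published on the Hugging Face Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen-7B-Chat"  # assumed Hub ID
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",       # uses the accelerate package installed above
    trust_remote_code=True,  # Qwen ships custom modeling and tokenizer code
)

inputs = tokenizer("你好!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```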
@@ -321,9 +320,9 @@ We introduce NTK-aware interpolation, LogN attention scaling, Window attention,
## 量化(Quantization)

-如希望使用更低精度的量化模型,如4比特和8比特的模型,我们提供了简单的示例来说明如何快速使用量化模型。在开始前,确保你已经安装了`bitsandbytes
+如希望使用更低精度的量化模型,如4比特和8比特的模型,我们提供了简单的示例来说明如何快速使用量化模型。在开始前,确保你已经安装了`bitsandbytes`。请注意,`bitsandbytes`的安装要求是:

-We provide examples to show how to load models in `NF4` and `Int8`. For starters, make sure you have implemented `bitsandbytes`. Note that the requirements for `bitsandbytes`
+We provide examples to show how to load models in `NF4` and `Int8`. For starters, make sure you have implemented `bitsandbytes`. Note that the requirements for `bitsandbytes` are:

```
**Requirements** Python >=3.8. Linux distribution (Ubuntu, MacOS, etc.) + CUDA > 10.0.
@@ -342,7 +341,7 @@ pip install bitsandbytes
Then you only need to add your quantization configuration to `AutoModelForCausalLM.from_pretrained`. See the example below:

```python
-from transformers import BitsAndBytesConfig
+from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# quantization configuration for NF4 (4 bits)
quantization_config = BitsAndBytesConfig(
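
The hunk above ends in the middle of the example, so here is a hedged sketch of how the NF4 configuration and the quantized model load might be completed. The specific `BitsAndBytesConfig` field values and the checkpoint ID are assumptions, not the literal continuation of the README.

```python
# Hedged sketch of how the truncated example might continue; field values and
# the checkpoint ID are assumptions, not the README's exact continuation.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# quantization configuration for NF4 (4 bits)
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# for Int8 (8 bits), a config like BitsAndBytesConfig(load_in_8bit=True)
# could be used instead

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-7B-Chat",  # assumed Hub ID
    quantization_config=quantization_config,
    device_map="auto",
    trust_remote_code=True,
)
```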