---
license: apache-2.0
---
This is a quantized model built with the Auto-GPTQ framework. The quantized model is HuatuoGPT2-7B, a fine-tuned model whose base model is Baichuan-7B.

Parameter notes:

Original model size: 16 GB; quantized model size: 8 GB

Inference accuracy has not yet been tested, so please use this model with caution.

During quantization, the calibration data was drawn from the fine-tuning training set Medical Fine-tuning Instruction (GPT-4).

Usage example:
Make sure you have bitsandbytes installed:
```
pip install bitsandbytes
```

Make sure you have auto-gptq installed:
```
git clone https://github.com/AutoGPTQ/AutoGPTQ
cd AutoGPTQ
pip install -e .
```
```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation.utils import GenerationConfig

tokenizer = AutoTokenizer.from_pretrained("jiangchengchengNLP/huatuo_AutoGPTQ_7B4bits", use_fast=True, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("jiangchengchengNLP/huatuo_AutoGPTQ_7B4bits", device_map="auto", torch_dtype="auto", trust_remote_code=True)
model.generation_config = GenerationConfig.from_pretrained("jiangchengchengNLP/huatuo_AutoGPTQ_7B4bits")
messages = []
messages.append({"role": "user", "content": "肚子疼怎么办?"})  # "What should I do about a stomachache?"
response = model.HuatuoChat(tokenizer, messages)
print(response)
```
More quantization details:

Quantization environment: dual NVIDIA T4 GPUs

Calibration scale: 512 training pairs
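The exact calibration script is not published in this card. As a rough illustration only, here is a minimal sketch of sampling 512 instruction-response pairs and joining each into a single calibration text, the usual shape of a GPTQ calibration example before tokenization. The `build_calibration_set` helper, the 问/答 prompt template, and the toy data are assumptions, not the pipeline actually used:

```python
import random

def build_calibration_set(pairs, n=512, seed=0):
    # Sample n (instruction, response) pairs and join each into one text
    # string. Helper name and prompt template are illustrative guesses.
    rng = random.Random(seed)
    sampled = rng.sample(pairs, min(n, len(pairs)))
    return ["问:{}\n答:{}".format(q, a) for q, a in sampled]

# Toy stand-in for the Medical Fine-tuning Instruction (GPT-4) dataset.
pairs = [("肚子疼怎么办?", "建议先明确疼痛的部位和性质。")] * 1000
calib = build_calibration_set(pairs, n=512)
```

In auto-gptq, each such text would then be tokenized and passed to the quantizer as calibration examples.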

Quantization config:
```
from auto_gptq import BaseQuantizeConfig

quantize_config = BaseQuantizeConfig(
    bits=4,  # 4 or 8
    group_size=128,
    damp_percent=0.01,
    desc_act=False,  # False significantly speeds up inference, at a slight cost in perplexity
    static_groups=False,
    sym=True,
    true_sequential=True,
    model_name_or_path=None,
    model_file_base_name="model",
)
```
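To make the config concrete: with `sym=True`, each group of `group_size` weights shares one scale that maps the floats onto the signed 4-bit grid [-8, 7]. A toy sketch of that per-group step, assuming plain round-to-nearest (real GPTQ additionally compensates rounding error weight by weight, which this omits):

```python
def quantize_group(weights, bits=4):
    # Symmetric quantization of one group: a single scale per group maps
    # floats onto the signed integer grid [-2**(bits-1), 2**(bits-1) - 1].
    qmax = 2 ** (bits - 1) - 1                       # 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax or 1.0
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    dequant = [v * scale for v in q]                 # reconstructed floats
    return q, scale, dequant

group = [0.12, -0.07, 0.31, -0.25]                   # one tiny "group" of weights
q, scale, deq = quantize_group(group)                # q == [3, -2, 7, -6]
```

With `group_size=128`, a 7B-parameter weight matrix is split into groups of 128 along the input dimension and each group gets its own scale, which is why the quantized checkpoint is much smaller than the fp16 original.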