---
license: apache-2.0
---
This is a quantized model built with the Auto-GPTQ framework. The quantized model is huatuoGPT2-7B, a fine-tuned model whose base model is Baichuan-7B.

Parameter notes:

Original model size: 16 GB; quantized model size: 8 GB

Inference accuracy has not yet been tested, so please use this model with caution.

During quantization, the calibration data was taken from the fine-tuning training set Medical Fine-tuning Instruction (GPT-4).

Usage example:

Make sure you have installed bitsandbytes:
```
pip install bitsandbytes
```

Make sure you have installed auto-gptq:
```
git clone https://github.com/AutoGPTQ/AutoGPTQ
cd AutoGPTQ
pip install -e .
```

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation.utils import GenerationConfig

# Load the tokenizer and the quantized model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("jiangchengchengNLP/huatuo_AutoGPTQ_7B4bits", use_fast=True, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("jiangchengchengNLP/huatuo_AutoGPTQ_7B4bits", device_map="auto", torch_dtype="auto", trust_remote_code=True)
model.generation_config = GenerationConfig.from_pretrained("jiangchengchengNLP/huatuo_AutoGPTQ_7B4bits")

# Ask a question ("What should I do about a stomachache?")
messages = []
messages.append({"role": "user", "content": "肚子疼怎么办?"})
response = model.HuatuoChat(tokenizer, messages)
print(response)
```
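Building on the example above, a multi-turn conversation can be sketched by keeping the `messages` list across turns. This is a minimal sketch; the `"assistant"` role label for model replies is an assumption, not confirmed by the model card.

```python
# Sketch: accumulate chat history across turns for model.HuatuoChat.
# The "assistant" role label for model replies is an assumption.
messages = []

def add_turn(user_text, model_reply):
    """Record one user question and the model's reply in the shared history."""
    messages.append({"role": "user", "content": user_text})
    # In real use: model_reply = model.HuatuoChat(tokenizer, messages)
    messages.append({"role": "assistant", "content": model_reply})

add_turn("肚子疼怎么办?", "(model reply)")   # "What should I do about a stomachache?"
add_turn("需要吃药吗?", "(model reply)")     # "Do I need to take medicine?"
print(len(messages))  # 4: two user turns and two replies
```

Passing the full history back into each `HuatuoChat` call lets the model condition on earlier turns.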

More quantization details:

Quantization environment: dual T4 GPUs

Calibration set size: 512 training pairs

Quantization config:
```
quantize_config = BaseQuantizeConfig(
    bits=4,  # 4 or 8
    group_size=128,
    damp_percent=0.01,
    desc_act=False,  # False significantly speeds up inference, at a slight cost in perplexity
    static_groups=False,
    sym=True,
    true_sequential=True,
    model_name_or_path=None,
    model_file_base_name="model",
)
```
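For reference, here is a minimal sketch of how this config might drive the quantization step with auto-gptq. The source checkpoint path and the `calib_examples` variable (the 512 tokenized calibration pairs) are assumptions for illustration, not the exact script used for this model.

```python
# Hypothetical quantization driver, assuming `calib_examples` holds the 512
# tokenized calibration pairs as dicts with input_ids and attention_mask.
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

quantize_config = BaseQuantizeConfig(
    bits=4,
    group_size=128,
    damp_percent=0.01,
    desc_act=False,
)

# Load the full-precision fine-tuned model, run GPTQ calibration, and save.
model = AutoGPTQForCausalLM.from_pretrained(
    "FreedomIntelligence/HuatuoGPT2-7B",  # assumed source checkpoint
    quantize_config,
    trust_remote_code=True,
)
model.quantize(calib_examples)
model.save_quantized("huatuo_AutoGPTQ_7B4bits", use_safetensors=True)
```

This step is GPU-bound and was run on the dual-T4 environment described above.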