---
tasks:
- text-generation
---

<!-- markdownlint-disable first-line-h1 -->
<!-- markdownlint-disable html -->

# Baichuan 2 7B Chat - Int4
<!-- description start -->
## Description

This repo contains Int4 GPTQ model files for [Baichuan 2 7B Chat](https://huggingface.co/baichuan-inc/Baichuan2-7B-Chat).

<!-- description end -->


<!-- README_GPTQ.md-provided-files start -->
## GPTQ parameters

These GPTQ files were generated with AutoGPTQ, using the settings below (see the sketch after this list):
- Bits: 4/8
- GS (group size): 32/128
- Act Order: True
- Damp %: 0.1
- GPTQ dataset: a mixed Chinese and English dataset
- Sequence Length: 4096
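
For reference, here is a minimal sketch of how files with these settings can be produced with AutoGPTQ. The calibration texts and the output directory are placeholders, not the exact ones used for this repo:

```python
import torch
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from transformers import AutoTokenizer

base_model = "baichuan-inc/Baichuan2-7B-Chat"  # full-precision base model

# Settings matching the list above (the 4-bit / group-size-128 variant is shown).
quantize_config = BaseQuantizeConfig(
    bits=4,            # or 8
    group_size=128,    # or 32
    desc_act=True,     # "Act Order"
    damp_percent=0.1,  # "Damp %"
)

tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
model = AutoGPTQForCausalLM.from_pretrained(base_model, quantize_config, trust_remote_code=True)

# Calibration data: mixed Chinese/English samples, tokenized to at most 4096 tokens.
texts = ["温故而知新,可以为师矣。", "The quick brown fox jumps over the lazy dog."]  # placeholder samples
encodings = [tokenizer(t, return_tensors="pt", truncation=True, max_length=4096) for t in texts]
examples = [{"input_ids": e.input_ids, "attention_mask": e.attention_mask} for e in encodings]

model.quantize(examples)
model.save_quantized("Baichuan2-7B-Chat-Int4", use_safetensors=True)  # hypothetical output dir
```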
Benchmark scores and inference speed for comparison:

| Model version | agieval | ceval | cmmlu | Size | Inference speed (A100-40G) |
|---|---|---|---|---|---|
| [Baichuan2-13B-Chat](https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat) | ~ | ~ | ~ | 27.79 GB | 31.55 tokens/s |
| [Baichuan2-13B-Chat-4bits](https://huggingface.co/baichuan-inc/Baichuan2-13B-Chat-4bits) | ~ | ~ | ~ | 9.08 GB | 18.45 tokens/s |
| [GPTQ-4bit-32g](https://huggingface.co/csdc-atl/Baichuan2-13B-Chat-GPTQ-Int4/tree/4bit-32g) | ~ | ~ | ~ | 9.87 GB | 27.35 (hf) / 38.28 (autogptq) tokens/s |
| [GPTQ-4bit-128g](https://huggingface.co/csdc-atl/Baichuan2-13B-Chat-GPTQ-Int4/tree/main) | 38.78 | 56.42 | 57.78 | 9.14 GB | 28.74 (hf) / 39.24 (autogptq) tokens/s |
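
In the speed column, "(hf)" refers to loading the model through Transformers' GPTQ integration (as in the Python example further below), while "(autogptq)" refers to loading with AutoGPTQ's own loader. A minimal sketch of the latter, assuming this repo's files (the `device` value is illustrative):

```python
from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

model_name_or_path = "csdc-atl/Baichuan2-7B-Chat-Int4"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True, trust_remote_code=True)
# Load the already-quantized weights directly with AutoGPTQ instead of Transformers.
model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
                                           device="cuda:0",
                                           trust_remote_code=True)
```

Generation then works the same way as in the Transformers example below.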

<!-- README_GPTQ.md-provided-files end -->
## How to use this GPTQ model from Python code

### Install the necessary packages

Requires Transformers 4.32.0 or later, Optimum 1.12.0 or later, and AutoGPTQ 0.4.2 or later.

```shell
pip3 install "transformers>=4.32.0" "optimum>=1.12.0"
pip3 install auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/  # Use cu117 if on CUDA 11.7
```
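
Optionally, a quick way to confirm the installed versions meet these minimums, using only the Python standard library:

```python
import importlib.metadata as metadata

# Print the installed versions of the required packages.
for pkg in ("transformers", "optimum", "auto-gptq"):
    print(pkg, metadata.version(pkg))
```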

If you have problems installing AutoGPTQ from the pre-built pip wheels, install it from source instead:

```shell
pip3 uninstall -y auto-gptq
git clone https://github.com/PanQiWei/AutoGPTQ
cd AutoGPTQ
pip3 install .
```

### You can then use the following code

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation.utils import GenerationConfig

model_name_or_path = "csdc-atl/Baichuan2-7B-Chat-Int4"

# Load the quantized model; trust_remote_code is required for Baichuan's custom code.
model = AutoModelForCausalLM.from_pretrained(model_name_or_path,
                                             torch_dtype=torch.float16,
                                             device_map="auto",
                                             trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True, trust_remote_code=True)

# Reuse the generation settings of the original chat model.
model.generation_config = GenerationConfig.from_pretrained("baichuan-inc/Baichuan2-7B-Chat")

messages = []
messages.append({"role": "user", "content": "解释一下“温故而知新”"})  # "Explain '温故而知新'"
response = model.chat(tokenizer, messages)
print(response)
# Example output (translated): "温故而知新" is an ancient Chinese idiom from the
# "Wei Zheng" chapter of the Analects. It means that by reviewing the past we can
# discover new knowledge and understanding; learning from history and experience
# helps us better understand the present and the future. It encourages us to keep
# revisiting and reflecting on past experience, gaining new insight and growth to
# better cope with an ever-changing world and its challenges.
```
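
The chat history is just a list of role/content messages, so multi-turn use is a matter of appending turns. A small continuation of the example above (the follow-up question is illustrative):

```python
# Append the model's reply, then ask a follow-up question in the same session.
messages.append({"role": "assistant", "content": response})
messages.append({"role": "user", "content": "请举一个生活中的例子。"})  # "Give an everyday example."
response = model.chat(tokenizer, messages)
print(response)
```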
<!-- README_GPTQ.md-use-from-python end -->
<!-- README_GPTQ.md-compatibility start -->

---

<div align="center">
<h1>
  Baichuan 2