---
license: apache-2.0
tags:
- text2text-generation
pipeline_tag: text2text-generation
language:
- zh
- en
---

# GPTQ-for-Bloom

## Welcome
If you find this model helpful, please *like* it and star us at https://github.com/LianjiaTech/BELLE!

## Model description
8-bit quantization of [BELLE-7B-2M](https://huggingface.co/BelleGroup/BELLE-7B-2M) and [BELLE-7B-0.2M](https://huggingface.co/BelleGroup/BELLE-7B-0.2M) using [GPTQ](https://arxiv.org/abs/2210.17323).

GPTQ is a state-of-the-art one-shot weight quantization method.

The inference code can be found in our GitHub repository: https://github.com/LianjiaTech/BELLE/tree/main/gptq.

In general, 8-bit quantization with a group size of 128 is recommended.

**This code is based on [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa), adapted for the [Bloom](https://arxiv.org/pdf/2211.05100.pdf) model.**
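
GPTQ itself uses second-order information to minimize per-layer output error, but the effect of the "group size" setting is easy to illustrate: each run of 128 consecutive weights shares one scale and zero-point. Below is a minimal round-to-nearest sketch of group-wise quantization (this is *not* the GPTQ algorithm, and `quantize_groupwise` is a hypothetical helper for illustration only):

```python
import numpy as np

def quantize_groupwise(w, bits=8, groupsize=128):
    """Round-to-nearest quantization with one scale/zero per group of weights."""
    qmax = 2**bits - 1
    w = w.reshape(-1, groupsize)                          # one row per group
    wmin = w.min(axis=1, keepdims=True)
    wmax = w.max(axis=1, keepdims=True)
    scale = (wmax - wmin) / qmax                          # per-group step size
    q = np.clip(np.round((w - wmin) / scale), 0, qmax).astype(np.uint8)
    return q, scale, wmin

def dequantize(q, scale, wmin):
    """Reconstruct approximate fp32 weights from integers plus group metadata."""
    return q * scale + wmin

rng = np.random.default_rng(0)
w = rng.normal(size=4096).astype(np.float32)
q, scale, wmin = quantize_groupwise(w)
w_hat = dequantize(q, scale, wmin).reshape(-1)
print(np.abs(w - w_hat).max())   # per-weight error is bounded by scale/2
```

Smaller groups track the local weight range more tightly (lower error) at the cost of storing more scale/zero metadata, which is the trade-off behind the recommended group size of 128.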

## Model list

| model name                | file size | GPU memory usage |
| ------------------------- | --------- | ---------------- |
| base                      | 27G       | ~28.2G           |
| bloom7b-2m-8bit-128g.pt   | 9.7G      | ~11.4G           |
| bloom7b-2m-4bit-128g.pt   | 6.9G      | ~8.4G            |
| bloom7b-0.2m-8bit-128g.pt | 9.7G      | ~11.4G           |
| bloom7b-0.2m-4bit-128g.pt | 6.9G      | ~8.4G            |
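
As a rough cross-check of the table above, here is some back-of-envelope arithmetic. Assumptions (not stated in this card): ~7.1e9 parameters for the Bloom-7B base, an fp32 base checkpoint at 4 bytes per weight, and one fp16 scale and zero-point per 128-weight group. The published `.pt` files are larger than the quantized-weight estimate because some layers (e.g. embeddings) are typically left unquantized:

```python
# Back-of-envelope checkpoint-size arithmetic (assumed parameter count).
PARAMS = 7.1e9
GROUPSIZE = 128

def approx_quantized_gb(bits: int) -> float:
    """Approximate size of packed quantized weights plus per-group metadata, in GB."""
    weight_bytes = PARAMS * bits / 8             # packed integer weights
    meta_bytes = PARAMS / GROUPSIZE * 2 * 2      # fp16 scale + zero per group
    return (weight_bytes + meta_bytes) / 1e9

fp32_base_gb = PARAMS * 4 / 1e9                  # roughly in line with the ~27G base entry
print(f"fp32 base : {fp32_base_gb:.1f} GB")
print(f"8-bit     : {approx_quantized_gb(8):.1f} GB")
print(f"4-bit     : {approx_quantized_gb(4):.1f} GB")
```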

## Limitations
A few issues remain in the model trained on the current base model and data:

1. The model may produce factually incorrect answers when asked to follow fact-related instructions.

2. It occasionally generates harmful responses, since it still struggles to identify potentially harmful instructions.

3. Its reasoning and coding abilities still need improvement.

Because of these limitations, we require developers to use the open-sourced code, data, model, and any other artifacts generated by this project for research purposes only. Commercial use and other potentially harmful use cases are not allowed.

## Citation

Please cite us when using our code, data, or model.

```
@misc{BELLE,
  author = {Yunjie Ji and Yong Deng and Yan Gong and Yiping Peng and Qiang Niu and Baochang Ma and Xiangang Li},
  title = {BELLE: Bloom-Enhanced Large Language model Engine},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/LianjiaTech/BELLE}},
}
```

Please also cite the original BLOOM, Stanford Alpaca, and Self-Instruct papers!

***

# GPTQ-for-Bloom

## Welcome
If you find this model helpful, please *like* it and star us at https://github.com/LianjiaTech/BELLE!

## Model description
8-bit quantization of [BELLE-7B-2M](https://huggingface.co/BelleGroup/BELLE-7B-2M) and [BELLE-7B-0.2M](https://huggingface.co/BelleGroup/BELLE-7B-0.2M).

GPTQ is currently a state-of-the-art one-shot weight quantization method.

The inference code for this model can be found at https://github.com/LianjiaTech/BELLE/tree/main/gptq.

In general, 8-bit quantization with a group size of 128 is recommended.

**The inference code for applying [GPTQ](https://arxiv.org/abs/2210.17323) to the [Bloom](https://arxiv.org/pdf/2211.05100.pdf) model is based on [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).**

## Model list

| model name              | file size | GPU memory usage |
| ----------------------- | --------- | ---------------- |
| base                    | 27G       | ~28.2G           |
| bloom7b-2m-4bit-128g.pt | 5.0G      | ~8.0G            |

## Limitations and usage restrictions
The SFT model trained on the current data and base model still has the following issues:

1. It may produce factually incorrect answers to fact-related instructions.

2. It cannot reliably identify harmful instructions, and may therefore generate harmful responses.

3. Its abilities in scenarios involving reasoning, coding, and the like still need improvement.

Given these limitations, we require developers to use the open-sourced code, data, model, and any derivatives generated by this project for research purposes only; commercial use, and any other use that could harm society, is not allowed.

## Citation

Please cite this project when using its code, data, or model.

```
@misc{BELLE,
  author = {Yunjie Ji and Yong Deng and Yan Gong and Yiping Peng and Qiang Niu and Baochang Ma and Xiangang Li},
  title = {BELLE: Bloom-Enhanced Large Language model Engine},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/LianjiaTech/BELLE}},
}
```

Please also cite the original BLOOM paper, Stanford Alpaca, and Self-Instruct.