---
license: apache-2.0
tags:
- text2text-generation
pipeline_tag: text2text-generation
language:
- zh
- en
---

# GPTQ-for-Bloom

## Welcome
If you find this model helpful, please *like* this model and star us on https://github.com/LianjiaTech/BELLE!

## Model description
8-bit quantization of [Bloom](https://arxiv.org/pdf/2211.05100.pdf) using [GPTQ](https://arxiv.org/abs/2210.17323).

GPTQ is a SOTA one-shot weight quantization method.

The inference code for this model is available at https://github.com/LianjiaTech/BELLE/gptq.

Basically, 8-bit quantization and a groupsize of 128 are recommended; a sketch of what that means follows the model list below.

**The inference code is based on [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).**

## Model list

| Model name | File size | GPU memory usage |
| ------------------------- | --------- | ---------------- |
| base | 27G | ~28.2G |
| bloom7b-2m-8bit-128g.pt | 9.7G | ~11.4G |
| bloom7b-2m-4bit-128g.pt | 6.9G | ~8.4G |
| bloom7b-0.2m-8bit-128g.pt | 9.7G | ~11.4G |
| bloom7b-0.2m-4bit-128g.pt | 6.9G | ~8.4G |

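To make the recommendation concrete, here is a minimal sketch of what "8-bit quantization with groupsize 128" means: each row of a weight matrix is split into groups of 128 values, and every group gets its own scale and zero point. Note that this is plain round-to-nearest quantization for illustration only; GPTQ itself chooses the rounding using second-order (Hessian) information, so the function below is a hypothetical stand-in, not the BELLE implementation.

```python
import torch

def quantize_per_group(w: torch.Tensor, bits: int = 8, groupsize: int = 128):
    """Illustrative round-to-nearest quantization (not GPTQ): one scale and
    zero point per contiguous group of `groupsize` weights in each row."""
    rows, cols = w.shape
    assert cols % groupsize == 0, "columns must be divisible by groupsize"
    g = w.reshape(rows, cols // groupsize, groupsize)
    wmin = g.min(dim=-1, keepdim=True).values
    wmax = g.max(dim=-1, keepdim=True).values
    scale = (wmax - wmin).clamp(min=1e-8) / (2 ** bits - 1)
    zero = torch.round(-wmin / scale)
    q = torch.clamp(torch.round(g / scale) + zero, 0, 2 ** bits - 1)
    w_hat = (q - zero) * scale  # what a kernel reconstructs at inference time
    return q.reshape(rows, cols).to(torch.int32), w_hat.reshape(rows, cols)

w = torch.randn(16, 512)
q, w_hat = quantize_per_group(w, bits=8, groupsize=128)
print((w - w_hat).abs().max())  # reconstruction error is small at 8 bits
```

At 8 bits each weight needs one byte plus a little per-group metadata, which lines up with the ~9.7G 8-bit files above being roughly a third of the 27G base checkpoint.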
## Citation

Please cite us when using our code, data, or model.

```
@misc{BELLE,
  author = {Yunjie Ji and Yong Deng and Yan Gong and Yiping Peng and Qiang Niu and Baochang Ma and Xiangang Li},
  title = {BELLE: Bloom-Enhanced Large Language model Engine},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/LianjiaTech/BELLE}},
}
```

Please also cite the original BLOOM, Stanford Alpaca, and Self-Instruct papers!

***

# GPTQ-for-Bloom

## Welcome
If you find this model helpful, please *like* it and star us on the https://github.com/LianjiaTech/BELLE project!

## Model description
8-bit quantization of the [Bloom](https://arxiv.org/pdf/2211.05100.pdf) model using [GPTQ](https://arxiv.org/abs/2210.17323).

GPTQ is currently the SOTA one-shot weight quantization method.

For this model's inference code, see https://github.com/LianjiaTech/BELLE/gptq; a hypothetical loading sketch follows the model list below.

In general, 8-bit quantization with groupsize = 128 is recommended.

**The inference code is based on [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).**

## Model list

| Model name | File size | GPU memory usage |
| ------------------------- | --------- | ---------------- |
| base | 27G | ~28.2G |
| bloom7b-2m-8bit-128g.pt | 9.7G | ~11.4G |
| bloom7b-2m-4bit-128g.pt | 6.9G | ~8.4G |
| bloom7b-0.2m-8bit-128g.pt | 9.7G | ~11.4G |
| bloom7b-0.2m-4bit-128g.pt | 6.9G | ~8.4G |

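For orientation, here is roughly what loading one of these checkpoints could look like with a GPTQ-for-LLaMa-style loader. This is a hypothetical sketch: the `bloom` module, the `load_quant` signature, the `bigscience/bloom-7b1` tokenizer source, and the prompt format are all assumptions modeled on GPTQ-for-LLaMa, so check the gptq folder in the BELLE repo for the real entry point.

```python
# Hypothetical sketch, assuming a GPTQ-for-LLaMa-style helper in BELLE's gptq folder.
import torch
from transformers import AutoTokenizer

# `load_quant` is assumed: it should rebuild the Bloom graph with quantized
# linear layers, then load the packed weights from the .pt checkpoint.
from bloom import load_quant  # assumed module/function names

model = load_quant("bigscience/bloom-7b1",
                   "bloom7b-2m-8bit-128g.pt",  # file from the model list above
                   wbits=8, groupsize=128)
model.to("cuda").eval()

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-7b1")
inputs = tokenizer("Human: 你好\n\nAssistant:", return_tensors="pt").to("cuda")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```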
## Citation

Please cite this project if you use its code, data, or model.

```
@misc{BELLE,
  author = {Yunjie Ji and Yong Deng and Yan Gong and Yiping Peng and Qiang Niu and Baochang Ma and Xiangang Li},
  title = {BELLE: Bloom-Enhanced Large Language model Engine},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/LianjiaTech/BELLE}},
}
```

Please also cite the original BLOOM, Stanford Alpaca, and Self-Instruct papers.