BelleGroup/BELLE-7B-gptq

mabaochang committed on Mar 25, 2023

Commit 7e11068 (1 parent: 4171511)

Update README.md

Files changed (1)

README.md +17 -0

@@ -1,3 +1,20 @@
 ---
 license: apache-2.0
 ---
+# GPTQ-for-Bloom
+4-bit quantization of [Bloom](https://arxiv.org/pdf/2211.05100.pdf) using [GPTQ](https://arxiv.org/abs/2210.17323).
+GPTQ is a state-of-the-art (SOTA) one-shot weight quantization method.
+The inference code is available in our GitHub project repository: https://github.com/LianjiaTech/BELLE/gptq
+**This code is based on [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa)**
+## Model list
+| model name       |  file size | GPU memory |
+| -------------------------------------------------- |  ------------------- | ------------------ |
+|           bloom7b-2m-8bit-128g.pt                  |          9.7G        |       11G          |
+|           bloom7b-2m-4bit-128g.pt                  |          6.9G        |        8G          |
+|           bloom7b-2m-3bit-128g.pt                  |          6.2G        |        7.7G        |
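
The file names above carry a bit width (`8bit`, `4bit`, `3bit`) and a `128g` suffix. Assuming `128g` denotes a quantization group size of 128, as in the GPTQ-for-LLaMa naming convention, the storage idea can be sketched as below. This is only a minimal illustration of group-wise round-to-nearest quantization in plain Python; it is not the GPTQ algorithm itself (which also corrects for quantization error while it rounds) and not the code from the BELLE repository. All function names here are hypothetical.

```python
import math


def quantize_group(weights, bits):
    """Round-to-nearest asymmetric quantization of one group of floats.

    Returns integer codes plus the (scale, zero) pair needed to decode them.
    """
    levels = (1 << bits) - 1              # e.g. 15 for 4-bit codes
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def quantize(weights, bits=4, group_size=128):
    """Split a flat weight list into groups and quantize each independently."""
    return [
        quantize_group(weights[start:start + group_size], bits)
        for start in range(0, len(weights), group_size)
    ]


def dequantize(groups):
    """Reconstruct approximate float weights from the quantized groups."""
    restored = []
    for codes, scale, zero in groups:
        restored.extend(c * scale + zero for c in codes)
    return restored


# Toy data standing in for one row of a weight matrix (assumption: 256 values).
weights = [math.sin(i) for i in range(256)]
groups = quantize(weights, bits=4, group_size=128)
restored = dequantize(groups)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(f"groups: {len(groups)}, max reconstruction error: {max_error:.4f}")
```

Each group stores its own scale and zero point, so a smaller group size tracks the local weight range more closely at the cost of extra metadata per group.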