---
license: apache-2.0
tags:
- text2text-generation
pipeline_tag: text2text-generation
language:
- zh
- en
---

# GPTQ-for-Bloom

## Welcome

If you find this model helpful, please *like* this model and star us on https://github.com/LianjiaTech/BELLE!

## Model description

8-bit and 4-bit quantizations of [Bloom](https://arxiv.org/pdf/2211.05100.pdf) using [GPTQ](https://arxiv.org/abs/2210.17323).

GPTQ is a state-of-the-art one-shot weight quantization method. The inference code can be found in our GitHub project repository: https://github.com/LianjiaTech/BELLE/gptq (see also the minimal generation sketch at the end of this card). In general, 8-bit quantization with a group size of 128 is recommended.

**This code is based on [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa).**

## Model list

| model name                | file size | GPU memory usage |
| ------------------------- | --------- | ---------------- |
| base                      | 27 GB     | ~28.2 GB         |
| bloom7b-2m-8bit-128g.pt   | 9.7 GB    | ~11.4 GB         |
| bloom7b-2m-4bit-128g.pt   | 6.9 GB    | ~8.4 GB          |
| bloom7b-0.2m-8bit-128g.pt | 9.7 GB    | ~11.4 GB         |
| bloom7b-0.2m-4bit-128g.pt | 6.9 GB    | ~8.4 GB          |

## Citation

Please cite us when using our code, data, or model.

```
@misc{BELLE,
  author = {Yunjie Ji, Yong Deng, Yan Gong, Yiping Peng, Qiang Niu, Baochang Ma, Xiangang Li},
  title = {BELLE: Bloom-Enhanced Large Language model Engine},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/LianjiaTech/BELLE}},
}
```

Please also cite the original BLOOM, Stanford Alpaca, and Self-Instruct papers.
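
## Inference example

For reference, here is a minimal generation sketch using Hugging Face `transformers`. It loads a full-precision BELLE Bloom checkpoint; the quantized `*.pt` checkpoints listed above cannot be loaded this way and require the GPTQ inference code from https://github.com/LianjiaTech/BELLE/gptq. The model id, prompt format, and generation parameters below are illustrative assumptions, not guarantees about this repository.

```python
# Minimal sketch, assuming a full-precision BELLE Bloom checkpoint on the
# Hugging Face Hub. The quantized .pt files above are NOT loadable this way;
# they require the GPTQ inference code in the BELLE repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BelleGroup/BELLE-7B-2M"  # assumed repo id; substitute your own

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16 roughly halves memory vs. fp32
    device_map="auto",          # requires `accelerate`; places weights on GPU(s)
)

# Assumed BELLE-style instruction prompt format.
prompt = "Human: Write a short poem about spring.\n\nAssistant: "
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        top_p=0.85,
        temperature=0.7,
    )
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

As the table above shows, the 4-bit 128-group checkpoints run in roughly 8.4 GB of GPU memory versus ~28.2 GB for the full-precision base model, which is the main practical reason to prefer the quantized files on a single consumer GPU.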