BELLE-7B-gptq / README.md
license: apache-2.0
tags:
  - text2text-generation
pipeline_tag: text2text-generation
language:
  - zh
  - en

GPTQ-for-Bloom

Welcome

If you find this model helpful, please like it and star our project at https://github.com/LianjiaTech/BELLE

Model description

8-bit quantization of Bloom using GPTQ.

GPTQ is a state-of-the-art (SOTA) one-shot weight quantization method.

The inference code is in our GitHub project repository (https://github.com/LianjiaTech/BELLE/gptq).

In general, 8-bit quantization with a groupsize of 128 is recommended.
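
To make the two settings concrete, here is a minimal sketch of what the bit width and groupsize control: each run of `groupsize` consecutive weights shares one scale and offset, and each weight is stored as an integer code of `bits` bits. This is plain round-to-nearest quantization written for illustration, not the GPTQ algorithm and not the code in the BELLE repository; the function names and the random test weights are assumptions.

```python
import random

def quantize_groupwise(weights, bits=8, groupsize=128):
    """Quantize a flat list of floats with one (scale, offset) pair per group.

    Illustration only: plain round-to-nearest. GPTQ additionally adjusts the
    not-yet-quantized weights to compensate for the rounding error.
    """
    qmax = 2 ** bits - 1
    codes, params = [], []
    for start in range(0, len(weights), groupsize):
        group = weights[start:start + groupsize]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / qmax or 1.0  # fall back to 1.0 for a constant group
        codes.extend(round((w - lo) / scale) for w in group)
        params.append((scale, lo))
    return codes, params

def dequantize_groupwise(codes, params, groupsize=128):
    """Map integer codes back to floats using each group's scale and offset."""
    return [params[i // groupsize][1] + c * params[i // groupsize][0]
            for i, c in enumerate(codes)]

def max_abs_error(bits, weights, groupsize=128):
    """Largest absolute difference between original and restored weights."""
    codes, params = quantize_groupwise(weights, bits, groupsize)
    restored = dequantize_groupwise(codes, params, groupsize)
    return max(abs(a - b) for a, b in zip(weights, restored))

random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(1024)]  # hypothetical weights
err8 = max_abs_error(8, weights)
err4 = max_abs_error(4, weights)
print(f"max abs error: 8-bit={err8:.2e}, 4-bit={err4:.2e}")
```

With 8 bits each group has 256 levels instead of 16, so the rounding step is much finer, which is consistent with the recommendation above. A smaller groupsize narrows the range each scale must cover, at the cost of storing more scale and offset pairs.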

This code is based on GPTQ-for-LLaMa.

Model list

| Model name | File size | GPU memory usage |
| --- | --- | --- |
| base | 27G | ~28.2G |
| bloom7b-2m-8bit-128g.pt | 9.7G | ~11.4G |
| bloom7b-2m-4bit-128g.pt | 6.9G | ~8.4G |
| bloom7b-0.2m-8bit-128g.pt | 9.7G | ~11.4G |
| bloom7b-0.2m-4bit-128g.pt | 6.9G | ~8.4G |
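
The savings implied by the model list can be worked out with a few lines of arithmetic. The sketch below uses only the figures listed above and assumes that "G" means gigabytes; the dictionary keys and the `savings` helper are illustrative names, not part of the BELLE code.

```python
# Sizes copied from the model list above ("G" is assumed to mean gigabytes).
BASE = {"file": 27.0, "gpu": 28.2}
QUANTIZED = {
    "8bit-128g": {"file": 9.7, "gpu": 11.4},
    "4bit-128g": {"file": 6.9, "gpu": 8.4},
}

def savings(base, quantized):
    """Return (file size ratio, GPU memory ratio) relative to the base model."""
    return base["file"] / quantized["file"], base["gpu"] / quantized["gpu"]

for name, sizes in QUANTIZED.items():
    file_ratio, gpu_ratio = savings(BASE, sizes)
    print(f"{name}: file {file_ratio:.1f}x smaller, GPU memory {gpu_ratio:.1f}x smaller")
```

By these numbers the 8-bit checkpoints are roughly 2.8x smaller on disk than the base model and the 4-bit checkpoints roughly 3.9x smaller. GPU memory shrinks a little less (about 2.5x and 3.4x), since the listed usage is higher than the file size in every row.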

Citation

Please cite us when using our code, data or model.

@misc{BELLE,
  author = {Yunjie Ji, Yong Deng, Yan Gong, Yiping Peng, Qiang Niu, Baochang Ma, Xiangang Li},
  title = {BELLE: Bloom-Enhanced Large Language model Engine },
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/LianjiaTech/BELLE}},
}

Please also cite the original BLOOM, Stanford Alpaca, and Self-Instruct papers.

