---
license: apache-2.0
tags:
- text2text-generation
pipeline_tag: text2text-generation
language:
- zh
- en
---
# GPTQ-for-Bloom
4-bit quantization of [Bloom](https://arxiv.org/pdf/2211.05100.pdf) using [GPTQ](https://arxiv.org/abs/2210.17323).

GPTQ is a state-of-the-art one-shot weight quantization method.
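
To give a feel for what low-bit, group-wise weight quantization means, here is a minimal sketch of plain round-to-nearest quantization in pure Python. It is not GPTQ: GPTQ additionally uses approximate second-order information to compensate for rounding error, which this sketch omits. The function names and the flat-list weight layout are hypothetical and chosen only for illustration.

```python
def quantize_group(weights, bits):
    """Round-to-nearest quantization of one group of floats (illustrative, not GPTQ)."""
    levels = (1 << bits) - 1                 # highest integer code, e.g. 15 for 4 bits
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels if hi > lo else 1.0
    # Map each weight to the nearest integer code and clamp to the valid range.
    codes = [min(levels, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize_group(codes, scale, zero):
    """Recover approximate float weights from integer codes."""
    return [c * scale + zero for c in codes]


def quantize(weights, bits=4, group_size=128):
    """Quantize a flat list of weights, one (codes, scale, zero) triple per group."""
    return [
        quantize_group(weights[i:i + group_size], bits)
        for i in range(0, len(weights), group_size)
    ]
```

Each group stores its own scale and zero point, so a smaller group size tracks the local weight range more closely at the cost of extra metadata.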

The inference code is available in our GitHub project repository: https://github.com/LianjiaTech/BELLE/gptq.

**This code is based on [GPTQ-for-LLaMa](https://github.com/qwopqwop200/GPTQ-for-LLaMa)**

## Model list

| Model name              | File size | GPU memory |
| ----------------------- | --------- | ---------- |
| bloom7b-2m-8bit-128g.pt | 9.7G      | 11G        |
| bloom7b-2m-4bit-128g.pt | 6.9G      | 8G         |
| bloom7b-2m-3bit-128g.pt | 6.2G      | 7.7G       |
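
As a rough aid for choosing among these files, the sketch below picks the highest-precision checkpoint whose listed GPU memory fits a given budget. It assumes that "G" in the table means gigabytes and that the GPU memory column is the amount needed for inference; the `pick_checkpoint` helper is hypothetical and not part of the BELLE repository.

```python
# (file name, bits, GPU memory in G) copied from the model list above.
CHECKPOINTS = [
    ("bloom7b-2m-8bit-128g.pt", 8, 11.0),
    ("bloom7b-2m-4bit-128g.pt", 4, 8.0),
    ("bloom7b-2m-3bit-128g.pt", 3, 7.7),
]


def pick_checkpoint(available_gb):
    """Return the highest-bit checkpoint that fits, or None if none does."""
    fitting = [c for c in CHECKPOINTS if c[2] <= available_gb]
    if not fitting:
        return None
    # Prefer more bits, on the assumption that more bits preserve more accuracy.
    return max(fitting, key=lambda c: c[1])[0]


if __name__ == "__main__":
    print(pick_checkpoint(8.0))
```

Real memory use also depends on sequence length and batch size, so treat the table values as a starting point and leave some headroom.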