iproskurina committed on
Commit
864a441
1 Parent(s): 366a2c4

Create README.md

Files changed (1)
  1. README.md +62 -0
README.md ADDED
@@ -0,0 +1,62 @@
+ ---
+ base_model: facebook/opt-125m
+ inference: false
+ model_creator: facebook
+ model_name: opt-125m
+ model_type: opt
+ pipeline_tag: text-generation
+ quantized_by: iproskurina
+ tags:
+ - pretrained
+ license: other
+ language:
+ - en
+ datasets:
+ - c4
+ ---
+
+
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/629a3dbcd496c6dcdebf41cc/t-6kpqFpEYJPT6zmvnm49.png" width="200" />
+
+ # OPT-125M-GPTQ
+
+
+ - Model creator: [Meta AI](https://huggingface.co/facebook)
+ - Original model: [OPT-125M](https://huggingface.co/facebook/opt-125m)
+
+ The model published in this repo was quantized to 4-bit precision using [AutoGPTQ](https://github.com/PanQiWei/AutoGPTQ).
+
+ **Quantization details**
+
+ **All quantization parameters were taken from the [GPTQ paper](https://arxiv.org/abs/2210.17323).**
+
+ GPTQ calibration data consisted of 128 random 2048-token segments from the [C4 dataset](https://huggingface.co/datasets/c4).
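
To make the calibration setup concrete, here is a minimal sketch of drawing 128 random windows of 2048 tokens. It assumes the corpus has already been tokenized into one flat list of token ids; the function name `sample_calibration_segments`, the fixed seed, and the stand-in token stream are hypothetical and are not taken from the actual quantization script, which may sample per document instead.

```python
import random

def sample_calibration_segments(token_ids, n_samples=128, seq_len=2048, seed=0):
    """Return n_samples random contiguous windows of seq_len tokens."""
    if len(token_ids) < seq_len:
        raise ValueError("token stream is shorter than one segment")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    segments = []
    for _ in range(n_samples):
        # randint is inclusive on both ends, so the last valid start is allowed
        start = rng.randint(0, len(token_ids) - seq_len)
        segments.append(token_ids[start:start + seq_len])
    return segments

# Stand-in for a tokenized C4 stream (hypothetical data, not real C4 tokens).
fake_stream = list(range(100_000))
segments = sample_calibration_segments(fake_stream)
print(len(segments), len(segments[0]))
```

Windows may overlap in this sketch; with a corpus as large as C4 that is unlikely to matter.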
+
+ The quantization group size is 128.
+
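
As a rough illustration of what "4-bit with a group size of 128" means, the sketch below quantizes a row of weights so that every 128 consecutive values share one scale and zero point. It uses plain round-to-nearest, which is not the GPTQ algorithm (GPTQ chooses the rounding using second-order information from the calibration data). The names `quantize_groups` and `dequantize_groups` are hypothetical and are not part of AutoGPTQ.

```python
import math

def quantize_groups(weights, bits=4, group_size=128):
    """Round-to-nearest quantization with one (scale, zero_point) per group."""
    qmax = 2 ** bits - 1  # 15 for 4-bit codes
    codes, params = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / qmax
        if scale == 0.0:
            scale = 1.0  # constant group: fall back to a unit step to avoid dividing by zero
        zero_point = round(-lo / scale)
        for w in group:
            q = round(w / scale) + zero_point
            codes.append(min(qmax, max(0, q)))  # clamp to the 4-bit range
        params.append((scale, zero_point))
    return codes, params

def dequantize_groups(codes, params, group_size=128):
    """Map integer codes back to approximate float weights."""
    restored = []
    for i, q in enumerate(codes):
        scale, zero_point = params[i // group_size]
        restored.append(scale * (q - zero_point))
    return restored

# Toy weight row: 256 values, i.e. two groups of 128.
weights = [math.sin(i / 10) for i in range(256)]
codes, params = quantize_groups(weights)
restored = dequantize_groups(codes, params)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(len(params), "groups, max abs error:", max_err)
```

A smaller group size stores more scales and zero points but tracks the local weight range more closely; 128 is the trade-off this repo uses.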
+ ## How to use this GPTQ model from Python code
+
+ ### Install the necessary packages
+
+ ```shell
+ pip install accelerate==0.26.1 datasets==2.16.1 dill==0.3.7 gekko==1.0.6 multiprocess==0.70.15 peft==0.7.1 rouge==1.0.1 sentencepiece==0.1.99
+ git clone https://github.com/upunaprosk/AutoGPTQ
+ cd AutoGPTQ
+ pip install -v .
+ ```
+ Recommended `transformers` version: 4.35.2.
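
Before loading the model, it can help to confirm that the environment matches these pins. The following is a small sketch using only the standard library; the `PINNED` dictionary lists a subset of the versions given above, and `check_versions` is a hypothetical helper, not part of any of these packages.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the install command and the recommendation above (subset).
PINNED = {
    "accelerate": "0.26.1",
    "datasets": "2.16.1",
    "peft": "0.7.1",
    "sentencepiece": "0.1.99",
    "transformers": "4.35.2",
}

def check_versions(pinned):
    """Map each package name to (installed_version_or_None, matches_pin)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, installed == wanted)
    return report

report = check_versions(PINNED)
for name, (installed, ok) in report.items():
    status = "ok" if ok else f"expected {PINNED[name]}"
    print(f"{name}: {installed or 'not installed'} ({status})")
```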
+
+ ### Then load the model and generate text
+
+ ```python
+ # Requires a CUDA-capable GPU: the quantized weights are loaded onto cuda:0.
+ from transformers import AutoTokenizer, TextGenerationPipeline
+ from auto_gptq import AutoGPTQForCausalLM
+ pretrained_model_dir = "iproskurina/opt-125m-gptq-4bit"
+ tokenizer = AutoTokenizer.from_pretrained(pretrained_model_dir, use_fast=True)
+ model = AutoGPTQForCausalLM.from_quantized(pretrained_model_dir, device="cuda:0", model_basename="model")
+ pipeline = TextGenerationPipeline(model=model, tokenizer=tokenizer)
+ print(pipeline("auto-gptq is")[0]["generated_text"])
+ ```
+
+ [**LICENSE**](https://huggingface.co/facebook/opt-125m/blob/main/LICENSE.md)