bnjmnmarie committed on
Commit
d994b72
1 Parent(s): b221737

Update README.md

Files changed (1)
  1. README.md +16 -1
README.md CHANGED
@@ -1,3 +1,18 @@
  ---
  license: mit
- ---
  ---
  license: mit
+ ---
+
+ Llama 2 7B quantized in 3-bit with GPTQ.
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from optimum.gptq import GPTQQuantizer
+ import torch
+
+ w = 3
+ model_path = "meta-llama/Llama-2-7b-hf"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=True)
+ model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)
+ quantizer = GPTQQuantizer(bits=w, dataset="c4", model_seqlen=4096)
+ quantized_model = quantizer.quantize_model(model, tokenizer)
+ ```
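For intuition about what `bits=3` implies, here is a minimal pure-Python sketch of round-to-nearest quantization onto a 3-bit grid. This is a deliberate simplification, not GPTQ itself (GPTQ chooses quantized weights to minimize layer-wise reconstruction error rather than rounding each weight independently); the function name and sample weights are illustrative only.

```python
def quantize_rtn(weights, bits=3):
    """Round each weight to the nearest point on a uniform 2**bits grid,
    then map it back to a float. With bits=3 the whole tensor can only
    take 8 distinct values."""
    levels = 2 ** bits
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / (levels - 1)  # step between adjacent grid points
    q = [round((w - w_min) / scale) for w in weights]  # integer codes 0..7
    return [w_min + qi * scale for qi in q]  # dequantized floats

weights = [-0.9, -0.31, 0.02, 0.4, 0.77]
deq = quantize_rtn(weights)
print(deq)  # each value lies within scale/2 of the original
```

The per-weight error of this naive scheme is bounded by half the grid step; GPTQ improves on it by updating the not-yet-quantized weights in a layer to compensate for the error each rounding introduces.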