OxxoCodes committed on
Commit 852edc2
1 Parent(s): dda0b64

Add model card

Files changed (1):
  1. README.md +38 -0
README.md ADDED
@@ -0,0 +1,38 @@
---
license: other
license_name: llama3
tags:
- llama-3
- conversational
---
# OxxoCodes/Meta-Llama-3-8B-Instruct-GPTQ
*Built with Meta Llama 3*

Meta Llama 3 is licensed under the Meta Llama 3 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.

# Model Description
This is a 4-bit GPTQ quantized version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct).

This model was quantized using the following quantization config:
```python
from auto_gptq import BaseQuantizeConfig

quantize_config = BaseQuantizeConfig(
    bits=4,
    group_size=128,
    desc_act=False,
    damp_percent=0.1,
)
```
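
To give an intuition for what `bits=4` and `group_size=128` mean, here is a minimal NumPy sketch of group-wise asymmetric round-to-nearest quantization: each group of 128 weights shares one scale and zero-point, and each weight is stored as a 4-bit integer. This is an illustration only, not AutoGPTQ's actual implementation (real GPTQ additionally applies second-order error correction when rounding):

```python
import numpy as np

def quantize_groupwise(weights, bits=4, group_size=128):
    """Round-to-nearest quantization with one scale/zero-point per group.

    Illustrative sketch only -- not the GPTQ algorithm itself.
    """
    qmax = 2 ** bits - 1                  # 15 for 4-bit
    w = weights.reshape(-1, group_size)   # one row per group of 128 weights
    wmin = w.min(axis=1, keepdims=True)
    wmax = w.max(axis=1, keepdims=True)
    scale = (wmax - wmin) / qmax
    scale[scale == 0] = 1.0               # guard against constant groups
    zero = np.round(-wmin / scale)
    q = np.clip(np.round(w / scale) + zero, 0, qmax).astype(np.uint8)
    return q, scale, zero

def dequantize(q, scale, zero):
    # Reconstruct approximate float weights from 4-bit codes.
    return (q.astype(np.float32) - zero) * scale
```

The reconstruction error per weight is bounded by half the group's scale, which is why smaller group sizes trade extra storage (more scales/zero-points) for better accuracy.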

To use this model, you need to install AutoGPTQ.
For detailed installation instructions, please refer to the [AutoGPTQ GitHub repository](https://github.com/AutoGPTQ/AutoGPTQ).

# Example Usage
```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
model = AutoGPTQForCausalLM.from_quantized("OxxoCodes/Meta-Llama-3-8B-Instruct-GPTQ")

output = model.generate(**tokenizer("The capital of France is", return_tensors="pt").to(model.device))[0]
print(tokenizer.decode(output))
```
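
Since this is the Instruct variant (tagged `conversational`), chat prompts should follow the Llama 3 chat format rather than the raw-completion style above. In practice you would call `tokenizer.apply_chat_template(messages, add_generation_prompt=True)`; the sketch below builds the same prompt string by hand to show its structure (the helper name `build_llama3_prompt` is ours, not part of any library):

```python
def build_llama3_prompt(messages):
    """Assemble a Llama 3 chat prompt from a list of {"role", "content"} dicts.

    Hand-rolled illustration of the template; prefer
    tokenizer.apply_chat_template(messages, add_generation_prompt=True).
    """
    parts = ["<|begin_of_text|>"]
    for m in messages:
        # Each turn: role header, blank line, content, end-of-turn token.
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Open an assistant header so the model generates the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
])
```

The resulting `prompt` can be tokenized and passed to `model.generate` exactly as in the example above.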