---
license: other
license_name: llama3
tags:
- llama-3
- conversational
---
# OxxoCodes/Meta-Llama-3-8B-Instruct-GPTQ
*Built with Meta Llama 3*

Meta Llama 3 is licensed under the Meta Llama 3 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.

# Model Description
This is a 4-bit GPTQ quantized version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct).

This model was quantized using the following quantization config:
```python
from auto_gptq import BaseQuantizeConfig

quantize_config = BaseQuantizeConfig(
    bits=4,
    group_size=128,
    desc_act=False,
    damp_percent=0.1,
)
```
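
As a rough back-of-the-envelope illustration (not a measured on-disk size), 4-bit weights with one fp16 scale and one 4-bit zero-point per 128-weight group average about 4.16 bits per parameter, versus 16 bits for fp16:

```python
# Approximate weight-storage cost of 4-bit GPTQ with group_size=128.
# Assumes one fp16 scale (16 bits) and one 4-bit zero-point per group;
# the actual checkpoint size also depends on packing and any layers
# left unquantized.
PARAMS = 8e9        # ~8B parameters
GROUP_SIZE = 128

fp16_gb = PARAMS * 16 / 8 / 1e9                 # 16 bits per param
bits_per_param = 4 + (16 + 4) / GROUP_SIZE      # ~4.16 bits per param
gptq_gb = PARAMS * bits_per_param / 8 / 1e9

print(f"fp16: {fp16_gb:.1f} GB, 4-bit GPTQ: {gptq_gb:.1f} GB")
# → fp16: 16.0 GB, 4-bit GPTQ: 4.2 GB
```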

To use this model, you need to install AutoGPTQ.
For detailed installation instructions, please refer to the [AutoGPTQ GitHub repository](https://github.com/AutoGPTQ/AutoGPTQ).

# Example Usage
```python
from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
model = AutoGPTQForCausalLM.from_quantized("OxxoCodes/Meta-Llama-3-8B-Instruct-GPTQ")

output = model.generate(**tokenizer("The capital of France is", return_tensors="pt").to(model.device))[0]
print(tokenizer.decode(output))
```
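
For chat-style use, Llama 3 Instruct expects its chat template, which `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` applies for you. As a sketch of what that template produces (hand-built here purely to show the structure; the special-token names come from the Llama 3 format):

```python
# Minimal sketch of the Llama 3 Instruct chat format.
# In practice, prefer tokenizer.apply_chat_template(...); this
# hand-built version only illustrates the prompt layout.
def build_llama3_prompt(messages):
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # Trailing assistant header cues the model to generate a reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
])
print(prompt)
```

The resulting string can be passed to the tokenizer in place of the plain prompt in the example above.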