---
license: unknown
---

[ehartford/WizardLM-7B-Uncensored](https://huggingface.co/ehartford/WizardLM-7B-Uncensored) quantized to **8bit GPTQ** with group size 128, true sequential enabled, and no act order (`desc_act=False`).

*For most uses this probably isn't what you want.* \
*For 4bit GPTQ quantizations see [TheBloke/WizardLM-7B-uncensored-GPTQ](https://huggingface.co/TheBloke/WizardLM-7B-uncensored-GPTQ)*

Quantized using AutoGPTQ with the following config:
```python
config: dict = dict(
    quantize_config=dict(
        model_file_base_name='WizardLM-7B-Uncensored',
        bits=8,                 # 8-bit weights
        desc_act=False,         # no act order
        group_size=128,
        true_sequential=True,
    ),
    use_safetensors=True,       # save as .safetensors rather than .bin
)
```
See `quantize.py` for the full script.
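As a rough sketch (not the actual `quantize.py`), this config maps onto AutoGPTQ's standard quantization flow roughly like so; the calibration sample and output directory below are placeholders:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

model_id = 'ehartford/WizardLM-7B-Uncensored'
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)

# Toy single-sample calibration set; a real run would use a proper dataset.
examples = [tokenizer('Calibration text goes here.')]

quantize_config = BaseQuantizeConfig(
    model_file_base_name='WizardLM-7B-Uncensored',
    bits=8,                 # 8-bit weights
    desc_act=False,         # no act order
    group_size=128,
    true_sequential=True,
)

model = AutoGPTQForCausalLM.from_pretrained(model_id, quantize_config)
model.quantize(examples)  # run GPTQ calibration
model.save_quantized('WizardLM-7B-Uncensored-8bit-128g', use_safetensors=True)
```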

Tested for compatibility with:
- WSL with GPTQ-for-Llama `triton` branch.

The AutoGPTQ loader should read the quantization parameters from the bundled `quantize_config.json` automatically.
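For example (the repo id below is a placeholder for wherever these weights are hosted):

```python
from auto_gptq import AutoGPTQForCausalLM

# Placeholder repo id; point this at the actual repository for these weights.
model = AutoGPTQForCausalLM.from_quantized(
    'your-namespace/WizardLM-7B-Uncensored-8bit-128g',
    device='cuda:0',
    use_safetensors=True,
)
```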
For GPTQ-for-Llama, use the following settings when loading:
- wbits: 8
- groupsize: 128
- model_type: llama