---
license: unknown
---
[ehartford/WizardLM-7B-Uncensored](https://huggingface.co/ehartford/WizardLM-7B-Uncensored) quantized to **8-bit GPTQ** with act order + true sequential, no group size.
*For most uses this probably isn't what you want.* \
*For 4-bit with no act order, or for compatibility with `old-cuda` (the text-generation-webui default), see [TheBloke/WizardLM-7B-uncensored-GPTQ](https://huggingface.co/TheBloke/WizardLM-7B-uncensored-GPTQ).*
Quantized using AutoGPTQ with the following config:
```python
config: dict = dict(
    quantize_config=dict(bits=8, desc_act=True, true_sequential=True,
                         model_file_base_name='WizardLM-7B-Uncensored'),
    use_safetensors=True,
)
```
See `quantize.py` for the full script.
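For reference, a minimal sketch of what such a script can look like with AutoGPTQ. The calibration text and output directory here are illustrative placeholders, not taken from the original `quantize.py`:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

base_model = "ehartford/WizardLM-7B-Uncensored"
out_dir = "WizardLM-7B-Uncensored-GPTQ"  # illustrative output path

tokenizer = AutoTokenizer.from_pretrained(base_model, use_fast=True)
# Calibration data: a real run would use a proper calibration set.
examples = [tokenizer("The quick brown fox jumps over the lazy dog.")]

quantize_config = BaseQuantizeConfig(
    bits=8,                  # 8-bit quantization
    group_size=-1,           # no group size
    desc_act=True,           # act order
    true_sequential=True,
    model_file_base_name="WizardLM-7B-Uncensored",
)

model = AutoGPTQForCausalLM.from_pretrained(base_model, quantize_config)
model.quantize(examples)
model.save_quantized(out_dir, use_safetensors=True)
```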
Tested for compatibility with:
- WSL with the GPTQ-for-LLaMa `triton` branch.
- Windows with AutoGPTQ on `cuda` (triton deselected).
The AutoGPTQ loader should read its configuration from `quantize_config.json`.\
For GPTQ-for-LLaMa, use the following settings when loading:
- wbits: 8
- groupsize: None
- model_type: llama
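For example, loading directly with AutoGPTQ from Python. This is a minimal sketch; the model directory is a placeholder for wherever the quantized weights live (local path or HF repo id):

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_dir = "WizardLM-7B-Uncensored-GPTQ"  # placeholder path to the quantized model

# The tokenizer comes from the unquantized base model.
tokenizer = AutoTokenizer.from_pretrained("ehartford/WizardLM-7B-Uncensored", use_fast=True)

# quantize_config.json in model_dir supplies bits / group size / act order.
model = AutoGPTQForCausalLM.from_quantized(model_dir, device="cuda:0", use_safetensors=True)

prompt = "Tell me about AI."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0]))
```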