ehartford/WizardLM-7B-Uncensored quantized to 8-bit GPTQ with group size 128 and true sequential, no act-order.
For most uses this probably isn't what you want.
For 4-bit GPTQ quantizations, see TheBloke/WizardLM-7B-uncensored-GPTQ.
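As a rough back-of-envelope comparison of why 4-bit is usually preferred (illustrative arithmetic only, ignoring embeddings and group-size metadata overhead):

```python
# Approximate weight storage for a 7B-parameter model at various
# precisions (decimal GB; rough estimate, overhead ignored).
def approx_size_gb(n_params: float, bits: int) -> float:
    return n_params * bits / 8 / 1e9

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: ~{approx_size_gb(7e9, bits):.1f} GB")
# 16-bit: ~14.0 GB, 8-bit: ~7.0 GB, 4-bit: ~3.5 GB
```

The 8-bit weights here are roughly twice the size of the 4-bit variant, which matters mostly when VRAM is tight.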
Quantized using AutoGPTQ with the following config:

```python
config: dict = dict(
    quantize_config=dict(
        model_file_base_name='WizardLM-7B-Uncensored',
        bits=8,
        desc_act=False,
        group_size=128,
        true_sequential=True,
    ),
    use_safetensors=True,
)
```
See `quantize.py` for the full script.
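The config above maps onto AutoGPTQ's API roughly as follows. This is a sketch in the spirit of `quantize.py`, not the actual script: the function name, the calibration `examples`, and the output directory are illustrative assumptions.

```python
# Sketch of an AutoGPTQ quantization driver (assumes the auto-gptq
# package; `examples` and `out_dir` are illustrative placeholders).
quantize_kwargs = dict(
    model_file_base_name="WizardLM-7B-Uncensored",
    bits=8,
    desc_act=False,        # no act-order, as stated above
    group_size=128,
    true_sequential=True,
)

def quantize(examples, out_dir="wizardlm-7b-uncensored-8bit"):
    # Deferred imports: auto-gptq and a CUDA GPU are required to run this.
    from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

    model = AutoGPTQForCausalLM.from_pretrained(
        "ehartford/WizardLM-7B-Uncensored",
        BaseQuantizeConfig(**quantize_kwargs),
    )
    model.quantize(examples)  # examples: tokenized calibration samples
    model.save_quantized(out_dir, use_safetensors=True)
```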
Tested for compatibility with:
- WSL with GPTQ-for-Llama, `triton` branch.
The AutoGPTQ loader should read the configuration from `quantize_config.json`.
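When loading in Python with AutoGPTQ directly, something like the following should work. A minimal sketch, assuming the `auto-gptq` and `transformers` packages and a CUDA device; pass this model's repo id in place of `repo_id`.

```python
# Minimal AutoGPTQ loading sketch. Because the model ships with
# quantize_config.json, from_quantized should pick up bits/group_size
# automatically and no quantization arguments need repeating here.
def load_quantized(repo_id, device="cuda:0"):
    # Deferred imports so the sketch parses without the libraries installed.
    from auto_gptq import AutoGPTQForCausalLM
    from transformers import AutoTokenizer

    model = AutoGPTQForCausalLM.from_quantized(
        repo_id, use_safetensors=True, device=device
    )
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    return model, tokenizer
```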
For GPTQ-for-Llama, use the following configuration when loading:

```
wbits: 8
groupsize: 128
model_type: llama
```