Squish42/WizardLM-7B-Uncensored-GPTQ-8bit-128g


	---
	license: unknown
	---

	[ehartford/WizardLM-7B-Uncensored](https://huggingface.co/ehartford/WizardLM-7B-Uncensored) quantized to 8-bit GPTQ with group size 128, true sequential enabled, and no act order (`desc_act=False`).

	For most uses this 8-bit quantization probably isn't what you want. \
	For 4-bit GPTQ quantizations, see [TheBloke/WizardLM-7B-uncensored-GPTQ](https://huggingface.co/TheBloke/WizardLM-7B-uncensored-GPTQ).
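
As a rough illustration of the size trade-off between 8-bit and 4-bit, the sketch below estimates the storage cost per weight for a grouped quantization. It assumes each group stores one fp16 scale and one zero point packed at the weight bit width. Both the function name and that layout are assumptions for illustration and are not taken from `quantize.py`.

```python
def effective_bits_per_weight(bits: int, group_size: int, scale_bits: int = 16) -> float:
    """Estimate storage bits per quantized weight, including per-group metadata.

    Assumption: every group of `group_size` weights carries one scale
    (`scale_bits` wide) and one zero point (`bits` wide).
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive for this estimate")
    overhead = (scale_bits + bits) / group_size
    return bits + overhead


eight_bit = effective_bits_per_weight(bits=8, group_size=128)
four_bit = effective_bits_per_weight(bits=4, group_size=128)

print(f"8-bit, group size 128: {eight_bit} bits per weight")
print(f"4-bit, group size 128: {four_bit} bits per weight")
print(f"size ratio (8-bit / 4-bit): {eight_bit / four_bit:.2f}")
```

Under these assumptions the 8-bit file needs roughly twice the storage and memory of a 4-bit one for the quantized layers. Layers left unquantized are not covered by this estimate.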

	Quantized using AutoGPTQ with the following config:
	```python
	config: dict = dict(
	    quantize_config=dict(
	        model_file_base_name='WizardLM-7B-Uncensored',
	        bits=8, desc_act=False, group_size=128, true_sequential=True,
	    ),
	    use_safetensors=True,
	)
	```
	See `quantize.py` for the full script.

	Tested for compatibility with:
	- WSL with the GPTQ-for-LLaMa `triton` branch.

	The AutoGPTQ loader should read its configuration from `quantize_config.json`.\
	For GPTQ-for-LLaMa, use the following configuration when loading:\
	wbits: 8\
	groupsize: 128\
	model_type: llama
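
The manual settings above mirror the values stored for AutoGPTQ. A minimal sketch of deriving them from `quantize_config.json` follows. It assumes the file uses the same `bits` and `group_size` key names as the quantization config shown earlier. The helper name `gptq_for_llama_settings` is hypothetical, and the temporary file only stands in for the real one in this repository.

```python
import json
import tempfile
from pathlib import Path


def gptq_for_llama_settings(quantize_config_path, model_type: str = "llama") -> dict:
    """Map AutoGPTQ-style config keys to the loader settings listed above.

    Assumption: the JSON file contains `bits` and `group_size` keys.
    """
    cfg = json.loads(Path(quantize_config_path).read_text())
    return {
        "wbits": cfg["bits"],
        "groupsize": cfg["group_size"],
        "model_type": model_type,
    }


# Stand-in for the repository's quantize_config.json, using this card's values.
example_cfg = {"bits": 8, "group_size": 128, "desc_act": False, "true_sequential": True}

with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "quantize_config.json"
    cfg_path.write_text(json.dumps(example_cfg))
    settings = gptq_for_llama_settings(cfg_path)

for key, value in settings.items():
    print(f"{key}: {value}")
```

`model_type` is passed in rather than read from the file, since this card lists it as a separate loader setting.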