---
inference: false
license: llama2
model_creator: WizardLM
model_link: https://huggingface.co/WizardLM/WizardLM-70B-V1.0
model_name: WizardLM 70B V1.0
model_type: llama
quantized_by: Thireus
---
# WizardLM 70B V1.0 - EXL2
- Model creator: WizardLM
- Original model: WizardLM 70B V1.0
- Quantized model: WizardLM 70B V1.0-HF (float16 of WizardLM 70B V1.0)
## Models available in this repository
| Branch | BITS (-b) | HEAD BITS (-hb) | MEASUREMENT LENGTH (-ml) | LENGTH (-l) | CAL DATASET (-c) | Size | ExLlama | Max Context Length |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| main | 4.0 | 6 | 2048 | 2048 | 0000.parquet (wikitext-2-raw-v1) | 33GB | V2 | 4096 |
## Description:
This repository contains EXL2 model files for WizardLM's WizardLM 70B V1.0.
EXL2 is a new format used by ExLlamaV2 (https://github.com/turboderp/exllamav2). EXL2 is based on the same optimization method as GPTQ. The format allows for mixing quantization levels within a model to achieve any average bitrate between 2 and 8 bits per weight.
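To show how an EXL2 quant like this one is typically consumed, here is a minimal inference sketch with the ExLlamaV2 Python API. It is a sketch only: the class and method names follow ExLlamaV2's own example scripts and may differ between library versions, and the local model path, sampling settings, and example question are assumptions.

```python
# Minimal sketch, assuming the class/method names used in ExLlamaV2's example
# scripts; these may change between versions. The model path is an assumption.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/models/WizardLM-70B-V1.0-4.0bpw-h6-exl2"  # assumed local download path
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)  # cache is allocated as layers are loaded
model.load_autosplit(cache)               # split the ~33GB of weights across available GPUs

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.7
settings.top_p = 0.9

# See the prompt template sections below for the expected prompt format.
prompt = ("A chat between a curious user and an artificial intelligence assistant. "
          "The assistant gives helpful, detailed, and polite answers to the user's questions. "
          "USER: What is 4-bit quantization? ASSISTANT:")
print(generator.generate_simple(prompt, settings, 200))
```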
## Prompt template (official) - Vicuna:
```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {prompt} ASSISTANT:
```
## Prompt template (suggested):
```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
USER:
{prompt}
ASSISTANT:
```
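For illustration, here is a small sketch of filling the `{prompt}` placeholder in the suggested template before sending the string to the model; the helper name and the example question are hypothetical.

```python
# Hypothetical helper that fills the suggested Vicuna-style template above.
SYSTEM = ("A chat between a curious user and an artificial intelligence assistant. "
          "The assistant gives helpful, detailed, and polite answers to the user's questions.")

def build_prompt(user_message: str) -> str:
    """Return the full prompt string in the suggested multi-line format."""
    return f"{SYSTEM}\nUSER:\n{user_message}\nASSISTANT:\n"

print(build_prompt("Explain what a 4.0 bpw quantization is."))
```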
## Quantization process:
| Original Model | → | Float16 Model | → | Safetensor Model | → | EXL2 Model |
| --- | --- | --- | --- | --- | --- | --- |
| WizardLM 70B V1.0 | → | WizardLM 70B V1.0-HF | → | Safetensor* | → | EXL2 |
Example to convert WizardLM-70B-V1.0-HF_float16_safetensored to EXL2 4.0 bpw with 6-bit head:
```bash
mkdir -p ~/EXL2/WizardLM-70B-V1.0-HF_4bit # Create the output directory
# Quantize with ExLlamaV2's convert.py: 4.0 bits per weight (-b), 6-bit head (-hb),
# calibrating against the wikitext-2-raw-v1 parquet listed in the table above (-c).
python convert.py -i ~/safetensor/WizardLM-70B-V1.0-HF_float16_safetensored -o ~/EXL2/WizardLM-70B-V1.0-HF_4bit -c ~/EXL2/0000.parquet -b 4.0 -hb 6
```
(*) Use any one of the following scripts to convert your float16 pytorch_model bin files to safetensors (a minimal sketch of the same operation follows the list):
- https://github.com/turboderp/exllamav2/blob/master/util/convert_safetensors.py
- https://huggingface.co/Panchovix/airoboros-l2-70b-gpt4-1.4.1-safetensors/blob/main/bin2safetensors/convert.py
- https://gist.github.com/epicfilemcnulty/1f55fd96b08f8d4d6693293e37b4c55e
- https://github.com/oobabooga/text-generation-webui/blob/main/convert-to-safetensors.py
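Alternatively, the same bin-to-safetensors step can be sketched directly with the `safetensors` library; the file names below are assumptions, and sharded checkpoints or models with shared tensors may need one of the fuller scripts listed above.

```python
# Minimal sketch: convert a single pytorch_model.bin shard to safetensors.
# File names are assumptions; multi-shard checkpoints need the scripts linked above.
import torch
from safetensors.torch import save_file

state_dict = torch.load("pytorch_model.bin", map_location="cpu")
# safetensors requires contiguous tensors and rejects tensors that share storage.
state_dict = {k: v.contiguous() for k, v in state_dict.items()}
save_file(state_dict, "model.safetensors")
```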