---
inference: false
license: llama2
model_creator: WizardLM
model_link: https://huggingface.co/WizardLM/WizardLM-70B-V1.0
model_name: WizardLM 70B V1.0
model_type: llama
quantized_by: Thireus
---
# WizardLM 70B V1.0 - EXL2
- Model creator: [WizardLM](https://huggingface.co/WizardLM)
- Original model: [WizardLM 70B V1.0](https://huggingface.co/WizardLM/WizardLM-70B-V1.0)
- Model used for quantization: [WizardLM 70B V1.0-HF](https://huggingface.co/simsim314/WizardLM-70B-V1.0-HF), a float16 conversion of [WizardLM 70B V1.0](https://huggingface.co/WizardLM/WizardLM-70B-V1.0)
## Models available in this repository
| Branch | BITS (-b) | HEAD BITS (-hb) | MEASUREMENT LENGTH (-ml) | LENGTH (-l) | CAL DATASET (-c) | Size | ExLlama | Max Context Length |
| ------ | --------- | --------------- | ------------------------ | ----------- | ---------------- | ---- | ------- | ------------------ |
| [main](https://huggingface.co/Thireus/WizardLM-70B-V1.0-HF-4.0bpw-h6-exl2/tree/main) | 4.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | 35GB | [v2](https://github.com/turboderp/exllamav2) | 4096 |
| _coming soon..._ | 5.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | ...GB | [v2](https://github.com/turboderp/exllamav2) | 4096 |
| _coming soon..._ | 6.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | ...GB | [v2](https://github.com/turboderp/exllamav2) | 4096 |
\* wikitext-2-raw-v1
## Description:
_This repository contains EXL2 model files for [WizardLM's WizardLM 70B V1.0](https://huggingface.co/WizardLM/WizardLM-70B-V1.0)._
EXL2 is a new format used by [ExLlamaV2](https://github.com/turboderp/exllamav2). It is based on the same optimization method as GPTQ and allows mixing quantization levels within a model to achieve any average bitrate between 2 and 8 bits per weight.
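To make the "average bitrate" idea concrete, here is a rough arithmetic sketch. It is not how ExLlamaV2 selects quantization levels; the function names are hypothetical, the parameter count is assumed to be a nominal 70e9, and the estimate ignores the higher-precision head and any file metadata.

```python
def average_bpw(groups):
    """Weighted average bits per weight.

    groups: iterable of (number_of_weights, bits) pairs, one pair per
    group of weights quantized at the same level (illustrative only).
    """
    groups = list(groups)
    total_weights = sum(n for n, _ in groups)
    total_bits = sum(n * bits for n, bits in groups)
    return total_bits / total_weights


def estimate_weight_size_gb(n_params, bits_per_weight):
    """Approximate size of the quantized weights in GB (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Half of the weights at 3 bits and half at 5 bits average out to 4.0 bpw.
mixed = average_bpw([(100, 3.0), (100, 5.0)])

# Assumption: a nominal 70e9 parameters at 4.0 bpw.
size_gb = estimate_weight_size_gb(70e9, 4.0)
print(mixed, size_gb)
```

Under these assumptions the estimate comes out at about 35 GB, which is in line with the size listed for the 4.0 bpw branch in the table above.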
## Prompt template (official):
```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {prompt} ASSISTANT:
```
## Prompt template (suggested):
```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.
USER:
{prompt}
ASSISTANT:
```
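The two templates above can be filled in programmatically. The helper names below are hypothetical, and the sketch assumes no trailing space or newline after `ASSISTANT:`, since the templates do not show one.

```python
SYSTEM_PROMPT = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_official_prompt(prompt):
    """Single-line form, following the official template."""
    return f"{SYSTEM_PROMPT} USER: {prompt} ASSISTANT:"


def build_suggested_prompt(prompt):
    """Multi-line form, following the suggested template."""
    return "\n".join([SYSTEM_PROMPT, "USER:", prompt, "ASSISTANT:"])


print(build_suggested_prompt("What is EXL2?"))
```

If your frontend expects a space or newline after `ASSISTANT:`, append it at the call site.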
## Quantization process:
| Original Model | → | Float16 Model* | → | Safetensor Model** | → | EXL2 Model |
| -------------- | --- | ------------- | --- | ---------------- | --- | ---------- |
| [WizardLM 70B V1.0](https://huggingface.co/WizardLM/WizardLM-70B-V1.0) | → | [WizardLM 70B V1.0-HF](https://huggingface.co/simsim314/WizardLM-70B-V1.0-HF)* | → | Safetensor** | → | EXL2 |
Example of converting WizardLM-70B-V1.0-HF_float16_safetensored to EXL2 at 4.0 bpw with a 6-bit head:
```
mkdir -p ~/EXL2/WizardLM-70B-V1.0-HF_4bit # Create the output directory
python convert.py -i ~/float16_safetensored/WizardLM-70B-V1.0-HF -o ~/EXL2/WizardLM-70B-V1.0-HF_4bit -c ~/EXL2/0000.parquet -b 4.0 -hb 6
```
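When producing several bitrates (for example the 5.0 and 6.0 bpw variants listed in the table), it can help to generate the command line from parameters. The sketch below only assembles the argument list using the flags that appear in this README (`-i`, `-o`, `-c`, `-b`, `-hb`, `-ml`, `-l`). The helper name is hypothetical, and it does not check that the paths exist or that your ExLlamaV2 version accepts every flag.

```python
def build_convert_command(input_dir, output_dir, cal_dataset, bits, head_bits,
                          measurement_length=None, length=None):
    """Assemble the argument list for ExLlamaV2's convert.py.

    Optional flags are appended only when a value is given, so the
    converter's own defaults apply otherwise.
    """
    cmd = [
        "python", "convert.py",
        "-i", str(input_dir),      # float16 safetensors model directory
        "-o", str(output_dir),     # output directory for the conversion
        "-c", str(cal_dataset),    # calibration dataset (parquet file)
        "-b", str(bits),           # target average bits per weight
        "-hb", str(head_bits),     # bits for the head layer
    ]
    if measurement_length is not None:
        cmd += ["-ml", str(measurement_length)]
    if length is not None:
        cmd += ["-l", str(length)]
    return cmd


cmd = build_convert_command(
    "~/float16_safetensored/WizardLM-70B-V1.0-HF",
    "~/EXL2/WizardLM-70B-V1.0-HF_4bit",
    "~/EXL2/0000.parquet",
    bits=4.0,
    head_bits=6,
)
print(" ".join(cmd))
```

The printed line can be pasted into a shell from the ExLlamaV2 checkout. If you pass the list to `subprocess.run` instead, expand `~` first (for example with `os.path.expanduser`), because no shell is involved to do it for you.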
\* Use the following script to convert your local pytorch_model bin files to float16 (you can also choose bfloat16) + safetensors all in one go:
- https://github.com/oobabooga/text-generation-webui/blob/main/convert-to-safetensors.py
(best for sharding and float16/FP16 or bfloat16/BF16 conversion)
\*\* Use any one of the following scripts to convert your local pytorch_model bin files to safetensors:
- https://github.com/turboderp/exllamav2/blob/master/util/convert_safetensors.py (official ExLlamaV2)
- https://huggingface.co/Panchovix/airoboros-l2-70b-gpt4-1.4.1-safetensors/blob/main/bin2safetensors/convert.py (recommended if model already converted to float16)
- https://gist.github.com/epicfilemcnulty/1f55fd96b08f8d4d6693293e37b4c55e#file-2safetensors-py