Thireus committed
Commit 919fbbf · 1 Parent(s): 14c9442

Update README.md

Files changed (1): README.md (+3 -1)

README.md CHANGED

@@ -16,7 +16,7 @@ quantized_by: Thireus
 
 ## Models available:
 
-| Link | BITS (-b) | HEAD BITS (-hb) | MEASU-REMENT LENGTH (-ml) | LENGTH (-l) | CAL DATASET (-c) | Size | V. | Max Context Length | Base Model | Layers | VRAM Min | VRAM Max | PPL** | Comments |
+| Link | BITS (-b) | HEAD BITS (-hb) | MEASU-REMENT LENGTH (-ml) | LENGTH (-l) | CAL DATASET (-c) | Size | V. | Max Context Length | Base Model | Layers | VRAM Min*** | VRAM Max*** | PPL** | Comments |
 | ------ | --------- | --------------- | ------------------------ | ----------- | ---------------- | ---- | ------- | ------------------ | ---- | ---- |------------------ | ------------------ | ------------------ | ---------------------------------------------------------------------------------- |
 | [here](https://huggingface.co/Thireus/WizardLM-70B-V1.0-FP32-4.0bpw-h6-exl2/) | 4.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | 33GB | [0.0.2](https://github.com/turboderp/exllamav2/tree/c0dd3412d59c0bc776264512bf76264e954c221d) | 4096 | [FP32](https://huggingface.co/WizardLM/WizardLM-70B-V1.0) | 80 | 39GB | 44GB | 4.15234375 | Good results | | [here](https://huggingface.co/Thireus/WizardLM-70B-V1.0-HF-4.0bpw-h6-exl2/) | 4.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | 35GB | [0.0.1](https://github.com/turboderp/exllamav2/tree/aee7a281708d5faff2ad0ea4b3a3a4b754f458f3) | 4096 | [FP16](https://huggingface.co/simsim314/WizardLM-70B-V1.0-HF) | 80 | 40GB | 44GB | 4.1640625 | Model suffers from poor prompt understanding and logic is affected |
 | [here](https://huggingface.co/Thireus/WizardLM-70B-V1.0-BF16-4.0bpw-h6-exl2/) | 4.0 | 6 | 2048 | 2048 | [0000.parquet](https://huggingface.co/datasets/wikitext/tree/refs%2Fconvert%2Fparquet/wikitext-2-raw-v1/train)* | 33GB | [0.0.2](https://github.com/turboderp/exllamav2/tree/ec5164b8a8e282b91aedb2af94dfeb89887656b7) | 4096 | [BF16](https://huggingface.co/Thireus/WizardLM-70B-V1.0-BF16) | 80 | 39GB | 44GB | 4.2421875 | Model suffers from poor prompt understanding and logic is affected |
@@ -32,6 +32,8 @@ quantized_by: Thireus
 
 \*\* Evaluated with text-generation-webui ExLlama v0.0.2 on wikitext-2-raw-v1 (stride 512 and max_length 0). For reference, [TheBloke_WizardLM-70B-V1.0-GPTQ_gptq-4bit-32g-actorder_True](https://huggingface.co/TheBloke/WizardLM-70B-V1.0-GPTQ/tree/gptq-4bit-32g-actorder_True) has a score of 4.1015625 in perplexity.
 
+\*\*\* Without Flash Attention - For VRAM optimisation, make sure you install https://github.com/Dao-AILab/flash-attention#installation-and-features
+
 ## Description:
 
 _This repository contains EXL2 model files for [WizardLM's WizardLM 70B V1.0](https://huggingface.co/WizardLM/WizardLM-70B-V1.0)._
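
For context, the BITS (-b), HEAD BITS (-hb), MEASU-REMENT LENGTH (-ml), LENGTH (-l) and CAL DATASET (-c) columns are the arguments used with exllamav2's conversion script to produce each quant. Below is a minimal sketch of loading one of these EXL2 repositories with the exllamav2 Python API; it assumes a recent exllamav2 release (class and method names may differ in the v0.0.1/v0.0.2 commits pinned in the table), and the model path and GPU split values are placeholders.

```python
# Minimal sketch: loading one of the EXL2 quants above with the exllamav2 Python API.
# Assumes a recent exllamav2 release; the v0.0.1/v0.0.2 commits pinned in the table
# may expose slightly different names. Path and GPU split are placeholders.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/models/WizardLM-70B-V1.0-FP32-4.0bpw-h6-exl2"  # local download of the repo
config.prepare()

model = ExLlamaV2(config)
model.load([22, 22])                 # example split (GB per GPU) across two cards
tokenizer = ExLlamaV2Tokenizer(config)
cache = ExLlamaV2Cache(model)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.7
settings.top_p = 0.9

print(generator.generate_simple("USER: Explain EXL2 quantization.\nASSISTANT:", settings, num_tokens=200))
```

The `[22, 22]` split is only an example sized against the ~44GB VRAM Max figure quoted in the table for the 4.0bpw quants.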
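On the PPL** footnote: the scores come from text-generation-webui's perplexity evaluation with stride 512 on wikitext-2-raw-v1. The snippet below is a generic illustration of stride-based sliding-window perplexity using transformers, not the exact webui harness; "max_length 0" is read here as "use the model's full context window", and the model name is a placeholder.

```python
# Generic stride-512 sliding-window perplexity, as an illustration of the metric
# quoted in the ** footnote (not the exact text-generation-webui harness).
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"  # placeholder: substitute the model under test
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

# Concatenate the wikitext-2-raw-v1 test split into one long token stream.
text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
ids = tokenizer(text, return_tensors="pt").input_ids

max_length = model.config.max_position_embeddings  # "max_length 0" taken as the native context
stride = 512
nlls, prev_end = [], 0

for begin in range(0, ids.size(1), stride):
    end = min(begin + max_length, ids.size(1))
    target_len = end - prev_end            # only score tokens not scored in the previous window
    input_ids = ids[:, begin:end]
    target_ids = input_ids.clone()
    target_ids[:, :-target_len] = -100     # mask the overlapping prefix so it serves as context only
    with torch.no_grad():
        nll = model(input_ids, labels=target_ids).loss
    nlls.append(nll * target_len)
    prev_end = end
    if end == ids.size(1):
        break

print("perplexity:", torch.exp(torch.stack(nlls).sum() / prev_end).item())
```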