latimar committed
Commit cee8ea0
1 Parent(s): 68e49f9

Update README

Files changed (1):
  1. README.md +16 -11
README.md CHANGED
@@ -26,6 +26,7 @@ There are the following branches:
 5_0-bpw-h8
 5_0-bpw-h8-evol-ins
 4_625-bpw-h6
+4_4-bpw-h8
 4_125-bpw-h6
 3_8-bpw-h6
 2_75-bpw-h6
@@ -36,17 +37,21 @@ There are the following branches:
 * Evaluation dataset used to calculate perplexity: [wikitext-v2](https://huggingface.co/datasets/wikitext/blob/refs%2Fconvert%2Fparquet/wikitext-2-v1/validation/0000.parquet)
 * Calibration dataset used for conversion of `5_0-bpw-h8-evol-ins`: [wizardLM-evol-instruct_70k](https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_70k/blob/refs%2Fconvert%2Fparquet/default/train/0000.parquet)
 * Evaluation dataset used to calculate ppl for `Evol-Ins`: [nickrosh-evol-instruct](https://huggingface.co/datasets/nickrosh/Evol-Instruct-Code-80k-v1/blob/refs%2Fconvert%2Fparquet/default/train/0000.parquet)
-* PPL max seq. length used: 1792 (2048 with 5.0-bpw-h8 causes OOM on an RTX 4090 when evaluating ppl, so it had to be lowered a bit)
-
+* When converting the `4_4-bpw-h8` quant, the additional `-mr 32` arg was used.
+
+PPL was measured with the [test_inference.py exllamav2 script](https://github.com/turboderp/exllamav2/blob/master/test_inference.py):
+
+```
+python test_inference.py -m /storage/models/LLaMA/EXL2/Phind-Codellama-34B-v2 -ed /storage/datasets/text/evol-instruct/nickrosh-evol-instruct-code-80k.parquet
+```
 
 | BPW       | PPL on Wiki | PPL on Evol-Ins | File Size (GB) |
 | --------- | ----------- | --------------- | -------------- |
-| 2.55-h6   | 15.0901     |                 | 10.56          |
-| 2.75-h6   | 13.6153     |                 | 11.33          |
-| 3.8-h6    | 6.8803      |                 | 15.37          |
-| 4.125-h6  | 6.8095      |                 | 16.65          |
-| 4.625-h6  | 6.7992      | 2.0499          | 18.58          |
-| 5.0-h8    | 6.7785      | 2.0448          | 20.09          |
-| 5.0-h8-ev | 6.9376      | 2.0430          | 20.09          |
-
-
+| 2.55-h6   | 11.0310     | 2.4542          | 10.56          |
+| 2.75-h6   | 9.7902      | 2.2888          | 11.33          |
+| 3.8-h6    | 6.7293      | 2.0724          | 15.37          |
+| 4.125-h6  | 6.6713      | 2.0617          | 16.65          |
+| 4.4-h8    | 6.6487      | 2.0509          | 17.76          |
+| 4.625-h6  | 6.6576      | 2.0459          | 18.58          |
+| 5.0-h8    | 6.6379      | 2.0419          | 20.09          |
+| 5.0-h8-ev | 6.7785      | 2.0445          | 20.09          |
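The quant branches themselves are produced with exllamav2's `convert.py`. A hedged sketch of how the `4_4-bpw-h8` branch might have been created: the paths are placeholders, and apart from `-mr 32` (which the commit confirms) the flag values are assumptions about the convert.py interface, not the author's recorded command.

```
# Hypothetical invocation (paths are placeholders, not the author's):
#   -b   target bits per weight, -hb head bits
#   -c   calibration dataset (parquet)
#   -mr  measurement rows -- the extra arg noted above for 4_4-bpw-h8
python convert.py \
    -i /path/to/Phind-CodeLlama-34B-v2-fp16 \
    -o /path/to/work-dir \
    -c /path/to/calibration.parquet \
    -b 4.4 -hb 8 -mr 32
```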
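For context on the PPL columns: perplexity is the exponential of the mean negative log-likelihood per token, so lower means the model predicts the evaluation text better. A minimal sketch of the metric itself (the toy log-probabilities below are made up for illustration; the real numbers come from running the model over the evaluation parquet):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Toy per-token log-probabilities (hypothetical): probabilities 1/2, 1/4, 1/8.
# Perplexity is the geometric mean of the inverse probabilities, here 4.0.
toy = [math.log(0.5), math.log(0.25), math.log(0.125)]
print(perplexity(toy))
```

This is why the gap between 2.55-h6 (11.03) and 5.0-h8 (6.64) on wikitext is substantial: each point of perplexity reflects a multiplicative difference in average per-token probability.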