latimar committed
Commit cee8ea0
1 Parent(s): 68e49f9

Update README

Files changed (1):
  1. README.md +16 -11
README.md CHANGED
@@ -26,6 +26,7 @@ There are the following branches:
 5_0-bpw-h8
 5_0-bpw-h8-evol-ins
 4_625-bpw-h6
+4_4-bpw-h8
 4_125-bpw-h6
 3_8-bpw-h6
 2_75-bpw-h6
@@ -36,17 +37,21 @@ There are the following branches:
 * Evaluation dataset used to calculate perplexity: [wikitext-v2](https://huggingface.co/datasets/wikitext/blob/refs%2Fconvert%2Fparquet/wikitext-2-v1/validation/0000.parquet)
 * Calibration dataset used for conversion of `5_0-bpw-h8-evol-ins`: [wizardLM-evol-instruct_70k](https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_70k/blob/refs%2Fconvert%2Fparquet/default/train/0000.parquet)
 * Evaluation dataset used to calculate ppl for `Evol-Ins`: [nickrosh-evol-instruct](https://huggingface.co/datasets/nickrosh/Evol-Instruct-Code-80k-v1/blob/refs%2Fconvert%2Fparquet/default/train/0000.parquet)
-* PPL max seq. length used: 1792 (2048 with 5.0-bpw-h8 causes OOM on an RTX 4090 when evaluating ppl, so it had to be lowered a bit)
-
+* When converting the `4_4-bpw-h8` quant, the additional `-mr 32` arg was used.
+
+PPL was measured with the [test_inference.py exllamav2 script](https://github.com/turboderp/exllamav2/blob/master/test_inference.py):
+
+```
+python test_inference.py -m /storage/models/LLaMA/EXL2/Phind-Codellama-34B-v2 -ed /storage/datasets/text/evol-instruct/nickrosh-evol-instruct-code-80k.parquet
+```
 
 | BPW       | PPL on Wiki | PPL on Evol-Ins | File Size (GB) |
 | --------- | ----------- | --------------- | -------------- |
-| 2.55-h6   | 15.0901     |                 | 10.56          |
-| 2.75-h6   | 13.6153     |                 | 11.33          |
-| 3.8-h6    | 6.8803      |                 | 15.37          |
-| 4.125-h6  | 6.8095      |                 | 16.65          |
-| 4.625-h6  | 6.7992      | 2.0499          | 18.58          |
-| 5.0-h8    | 6.7785      | 2.0448          | 20.09          |
-| 5.0-h8-ev | 6.9376      | 2.0430          | 20.09          |
-
-
+| 2.55-h6   | 11.0310     | 2.4542          | 10.56          |
+| 2.75-h6   | 9.7902      | 2.2888          | 11.33          |
+| 3.8-h6    | 6.7293      | 2.0724          | 15.37          |
+| 4.125-h6  | 6.6713      | 2.0617          | 16.65          |
+| 4.4-h8    | 6.6487      | 2.0509          | 17.76          |
+| 4.625-h6  | 6.6576      | 2.0459          | 18.58          |
+| 5.0-h8    | 6.6379      | 2.0419          | 20.09          |
+| 5.0-h8-ev | 6.7785      | 2.0445          | 20.09          |
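The quant branches themselves are produced with exllamav2's `convert.py`. A hedged sketch of how the `4_4-bpw-h8` branch might have been created: the paths are placeholders, and apart from `-mr 32` (which the commit confirms) the flag values are assumptions about the convert.py interface, not the author's recorded command.

```
# Hypothetical invocation (paths are placeholders, not the author's):
#   -b   target bits per weight, -hb head bits
#   -c   calibration dataset (parquet)
#   -mr  measurement rows -- the extra arg noted above for 4_4-bpw-h8
python convert.py \
    -i /path/to/Phind-CodeLlama-34B-v2-fp16 \
    -o /path/to/work-dir \
    -c /path/to/calibration.parquet \
    -b 4.4 -hb 8 -mr 32
```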
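For context on the PPL columns: perplexity is the exponential of the mean negative log-likelihood per token, so lower means the model predicts the evaluation text better. A minimal sketch of the metric itself (the toy log-probabilities below are made up for illustration; the real numbers come from running the model over the evaluation parquet):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Toy per-token log-probabilities (hypothetical): probabilities 1/2, 1/4, 1/8.
# Perplexity is the geometric mean of the inverse probabilities, here 4.0.
toy = [math.log(0.5), math.log(0.25), math.log(0.125)]
print(perplexity(toy))
```

This is why the gap between 2.55-h6 (11.03) and 5.0-h8 (6.64) on wikitext is substantial: each point of perplexity reflects a multiplicative difference in average per-token probability.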