TehVenom committed on
Commit
f5924cb
1 Parent(s): dc6b593

Update README.md

Files changed (1): README.md (+7 −1)
@@ -26,9 +26,15 @@ Quantization was done using https://github.com/oobabooga/GPTQ-for-LLaMa for use
 
 Via the following command:
 ```
-python llama.py ./TehVenom_Pygmalion-7b-Merged-Safetensors c4 --wbits 4 --true-sequential --groupsize 32 --save_safetensors Pygmalion-7B-GPTQ-4bit-32g.no-act-order.safetensors
+python llama.py ./TehVenom_Pygmalion-7b-Merged-Safetensors c4 --wbits 4 --act-order --save_safetensors Pygmalion-7B-GPTQ-4bit.act-order.safetensors
 ```
 
+This is the best eval I could get after trying many argument combinations: converting the model from bf16 to fp32 before quantizing down to 4-bit, with --act-order as the sole quantization argument.
+
+- Wikitext 2: 6.2477378845215
+- PTB-New: 46.5129699707031
+- C4-New: 7.8470954895020
+
 ## Prompting
 
 The model was trained on the usual Pygmalion persona + chat format, so any of the usual UIs should already handle everything correctly. If you're using the model directly, this is the expected formatting: