Panchovix committed
Commit fde1368
1 Parent(s): 3cc7678

Update README.md

Files changed (1): README.md +11 -0
README.md CHANGED
@@ -1,3 +1,14 @@
 ---
 license: other
 ---
+ [WizardLM-Uncensored-SuperCOT-StoryTelling-30b](https://huggingface.co/Monero/WizardLM-Uncensored-SuperCOT-StoryTelling-30b) merged with kaiokendev's [33b SuperHOT 8k LoRA](https://huggingface.co/kaiokendev/superhot-30b-8k-no-rlhf-test), quantized to 4-bit.
+
+ It was quantized with GPTQ-for-LLaMA, using group size 32 with act-order enabled, to keep perplexity as close as possible to the FP16 model.
+
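+ A hedged sketch of the quantization command (the calibration dataset shown, c4, and the paths are assumptions; the author states only 4-bit, group size 32, and act-order):
+
+ ```sh
+ python llama.py /path/to/merged-model c4 --wbits 4 --groupsize 32 --act-order --save_safetensors model-4bit-32g.safetensors
+ ```
+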
+ I HIGHLY suggest using exllama to avoid VRAM issues.
+
+ Set compress_pos_emb according to max_seq_len (the context length), as shown in the sketch after this list:
+
+ If max_seq_len = 4096, use compress_pos_emb = 2
+
+ If max_seq_len = 8192, use compress_pos_emb = 4
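+
+ A minimal sketch of these settings with exllama's Python API (imports follow the exllama repo layout; file paths are placeholders):
+
+ ```python
+ from model import ExLlama, ExLlamaCache, ExLlamaConfig
+
+ config = ExLlamaConfig("/path/to/model/config.json")
+ config.model_path = "/path/to/model/model-4bit-32g.safetensors"
+
+ # SuperHOT 8k works by compressing the RoPE positional embeddings, so the
+ # factor is max_seq_len divided by the 2048-token base context:
+ # 4096 / 2048 = 2, 8192 / 2048 = 4.
+ config.max_seq_len = 8192
+ config.compress_pos_emb = 4.0  # use 2.0 with max_seq_len = 4096
+
+ model = ExLlama(config)
+ cache = ExLlamaCache(model)
+ ```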