Panchovix committed
Commit a6998a1
1 Parent(s): f86143e

Update README.md

Files changed (1):
  1. README.md +13 -0
README.md CHANGED
---
license: other
---
[GPlatty-30B](https://huggingface.co/lilloukas/GPlatty-30B) merged with bhenrym14's [airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-LoRA](https://huggingface.co/bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-LoRA), quantized to 4-bit.
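
For reference, a merge like this can usually be reproduced with peft before quantizing. Below is a minimal sketch assuming the standard transformers/peft APIs; the output directory name is hypothetical, and this is an illustration rather than the exact procedure used for this upload.

```python
# Minimal LoRA-merge sketch (assumption: standard transformers/peft APIs;
# the output directory is hypothetical, not the path used for this repo).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "lilloukas/GPlatty-30B"
lora_id = "bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-LoRA"
out_dir = "gplatty-30b-lxctx-merged"  # hypothetical local path

# Load the FP16 base model, attach the LoRA, then bake the adapter weights in.
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
merged = PeftModel.from_pretrained(base, lora_id).merge_and_unload()

merged.save_pretrained(out_dir)
AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)
```

The merged FP16 checkpoint is what then gets quantized (see below).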

More info about the LoRA is available [here](https://huggingface.co/bhenrym14/airoboros-33b-gpt4-1.4.1-lxctx-PI-16384-LoRA). It is an alternative to the SuperHOT 8k LoRA, trained with LoRA rank 64 and the context extended to 16K on the airoboros 1.4.1 dataset.

It was created with GPTQ-for-LLaMA, using group size 32 and act-order true, to keep perplexity as close as possible to the FP16 model.

I HIGHLY suggest using exllama to avoid VRAM issues.

Use compress_pos_emb = 8 for any context length up to 16384 tokens.

If you have 2x 24 GB GPUs, use the following split to avoid out-of-memory errors at 16384 context:

gpu_split: 8.4,9.6
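
Putting the settings above together, here is a rough loading sketch based on exllama's bundled examples. The config attributes and the set_auto_map method are assumptions taken from the exllama code base and may differ between versions, and the local model directory is hypothetical.

```python
# Rough exllama loading sketch (assumptions: ExLlamaConfig exposes max_seq_len,
# compress_pos_emb and set_auto_map as in the upstream examples; the local
# model directory below is hypothetical).
import os, glob
from model import ExLlama, ExLlamaCache, ExLlamaConfig
from tokenizer import ExLlamaTokenizer
from generator import ExLlamaGenerator

model_dir = "models/GPlatty-30B-lxctx-4bit-32g"  # hypothetical local path

config = ExLlamaConfig(os.path.join(model_dir, "config.json"))
config.model_path = glob.glob(os.path.join(model_dir, "*.safetensors"))[0]
config.max_seq_len = 16384       # extended context window
config.compress_pos_emb = 8.0    # position embedding compression factor
config.set_auto_map("8.4,9.6")   # VRAM split for 2x 24 GB GPUs; omit on a single GPU

model = ExLlama(config)
tokenizer = ExLlamaTokenizer(os.path.join(model_dir, "tokenizer.model"))
cache = ExLlamaCache(model)
generator = ExLlamaGenerator(model, tokenizer, cache)

print(generator.generate_simple("Hello, my name is", max_new_tokens=32))
```

The value 8 for compress_pos_emb is simply the target context divided by LLaMA's native 2048-token context (16384 / 2048 = 8).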