Panchovix committed
Commit 70a3c3d (1 parent: a99f419)

Update README.md

Files changed (1): README.md (+1, -5)
README.md CHANGED

@@ -7,11 +7,7 @@ It was created with GPTQ-for-LLaMA with group size 32 and act order true as para

  I HIGHLY suggest to use exllama, to evade some VRAM issues.

- Use (max_seq_len = context):
-
- If max_seq_len = 4096, compress_pos_emb = 2
-
- If max_seq_len = 8192, compress_pos_emb = 4
+ Use compress_pos_emb = 4 for any context up to 8192 context.

  If you have 2x24 GB VRAM GPUs cards, to not get Out of Memory errors at 8192 context, use:

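
For readers applying the updated instruction, the sketch below shows roughly how these settings map onto exllama's Python loader. The class and attribute names (ExLlamaConfig, max_seq_len, compress_pos_emb, set_auto_map) are assumptions based on the upstream exllama project and may differ between versions; the model path and GPU split value are placeholders, not values taken from this README.

```python
# Minimal sketch, assuming the exllama (turboderp) Python API.
import glob
import os

from model import ExLlama, ExLlamaCache, ExLlamaConfig
from tokenizer import ExLlamaTokenizer

model_dir = "/path/to/model"  # placeholder path, not from the README

# Build the config from the model's config.json and point it at the weights.
config = ExLlamaConfig(os.path.join(model_dir, "config.json"))
config.model_path = glob.glob(os.path.join(model_dir, "*.safetensors"))[0]

# Per the updated README: compress_pos_emb = 4 covers any context up to 8192.
config.max_seq_len = 8192
config.compress_pos_emb = 4

# On 2x24 GB cards, split layers across both GPUs to avoid OOM at 8192 context.
# The split value here is illustrative only; use the value the README specifies.
config.set_auto_map("17,24")

model = ExLlama(config)
cache = ExLlamaCache(model)
tokenizer = ExLlamaTokenizer(os.path.join(model_dir, "tokenizer.model"))
```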