Nexesenex committed
Commit 7a8be7e · verified · 1 Parent(s): bcbeae9

Update README.md

Files changed (1): README.md (+14 −0)
README.md CHANGED

@@ -13,6 +13,20 @@ Lowers quants (SOTA 2 bits) to come if I'm able to make an iMatrix on my config

-----

+ Edit: Due to a CPU (i7-6700K) too weak for AI purposes, and only 36GB of VRAM, I remade Q3_K_S and Q2_K with a small iMatrix of ctx 32 with 25 chunks (so, 800 tokens), which lowers the perplexity by:
+
+ More than 3% in Rope 8 on Q2_K:
+
+ WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b1924-Q2_K.gguf,-,wikitext,6.2489,512
+ WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b1924-iMat-c32_ch25-Q2_K.gguf,-,wikitext,6.0482,512
+
+ More than 1% in Rope 8 on Q3_K_S:
+
+ WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b1924-Q3_K_S.gguf,-,wikitext,5.6127,512
+ WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b1924-iMat-c32_ch25-Q3_K_S.gguf,-,wikitext,5.5461,512
+
+ -----
+

Benchs of the original Q4_K_S quant I found :

Rope 8 10000
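
-----

The quoted gains read as relative perplexity drops; assuming that reading, they can be checked directly from the wikitext values listed in the diff above:

```sh
# Relative perplexity drop, from the figures quoted above:
awk 'BEGIN {
  printf "Q2_K:   %.2f%%\n", (6.2489 - 6.0482) / 6.2489 * 100   # 3.21%
  printf "Q3_K_S: %.2f%%\n", (5.6127 - 5.5461) / 5.6127 * 100   # 1.19%
}'
```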
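For reference, a minimal sketch of how such an iMatrix requant can be made with llama.cpp's imatrix, quantize, and perplexity tools (b1924 era). The f16 input and calibration file names are placeholders, and "Rope 8 10000" is assumed to mean linear rope scale 8 (freq scale 1/8) at base 10000; these are not the exact commands used for this commit, and flags may differ by build:

```sh
# Sketch only: model-f16.gguf and calibration.txt are placeholder names.

# 1) Build a small importance matrix: ctx 32, 25 chunks (~800 tokens).
./imatrix -m model-f16.gguf -f calibration.txt \
  -o imatrix-c32_ch25.dat -c 32 --chunks 25

# 2) Requantize with the iMatrix (repeat with Q3_K_S for the other quant).
./quantize --imatrix imatrix-c32_ch25.dat model-f16.gguf \
  WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b1924-iMat-c32_ch25-Q2_K.gguf Q2_K

# 3) Measure wikitext perplexity at ctx 512, reading "Rope 8 10000" as
#    --rope-freq-base 10000 with --rope-freq-scale 0.125 (linear scale 8).
./perplexity -m WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b1924-iMat-c32_ch25-Q2_K.gguf \
  -f wiki.test.raw -c 512 --rope-freq-base 10000 --rope-freq-scale 0.125
```

Even at ~800 calibration tokens, the iMatrix measurably helps the 2- and 3-bit K-quants, per the table above.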