Nexesenex committed on
Commit
4c779d7
verified
1 Parent(s): 0a94ba0

Update README.md

  1. README.md +8 -2
README.md CHANGED
@@ -15,7 +15,8 @@ And a bonus to play with it, my KoboldCPP_-_v1.55.1.b1933_-_Frankenstein from th
  
  -----
  
- Edit : Due to a poor CPU (i7-6700k) for AI purpose, and only 36GB of VRAM, I remade Q3_K_S and Q2_K with an small iMatrix of ctx 32 with 25 chunks (so, 640 tokens), and it lowers the perplexity by :
+ Edit : Due to a poor CPU (i7-6700k) for AI purpose, and only 36GB of VRAM, I remade Q3_K_S and Q2_K with an small iMatrix of ctx 32 with 25 chunks (so, 640 tokens).
+ And good news, it lowers the perplexity by :
  
  More than 3% in Rope 8 on Q2_K
  
@@ -37,6 +38,10 @@ More than 1% with Rope 8 on Q3_K_S
  WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b1924-Q3_K_S.gguf,-,wikitext,5.6127,512
  WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b1924-iMat-c32_ch25-Q3_K_S.gguf,-,wikitext,5.5461,512
  
+ A Q3_K_M with iMatrix has been added as well.
+
+ -----
+
  Interestingly, Rope 2.5 is almost without loss compared to rope 2, while 3 and 3.2 are quite good. Here are the values with the normal Q2_K :
  
  Rope 2.5 (max context 10240) : WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b1924-Q2_K.gguf,-,wikitext,4.5246,512
@@ -45,7 +50,8 @@ Rope 3 (max context 12288) : WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b
  
  Rope 3.2 (max context 13107) : WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b1924-Q2_K.gguf,-,wikitext,4.6679,512
  
- And for the adventurous, Rope 10 : (max context 40960) : WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b1924-Q2_K.gguf,-,wikitext,7.1577,512
+ And for the adventurous, Rope 10 : (max context 40960) : WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b1924-Q2_K.gguf,-,wikitext,7.1577,512
+ - Minus 3% With my Q2_K with c32ch25 iMatrix : WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b1924-iMat-c32_ch25-Q2_K.gguf,-,wikitext,6.9405,512
  
  So the linear rope, at least on this model, is flexible, and you can lower it to have the best peplexity for your max context.
  
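
The percentages and max-context figures quoted in the README can be checked with a few lines of arithmetic. The sketch below is not part of the commit: the function names are hypothetical, and the 4096-token base context is an assumption inferred from the quoted figures (for example 2.5 x 4096 = 10240), not something the README states. The perplexity values are copied from the result lines in the diff.

```python
# Minimal sketch, not the author's tooling: re-derive the quoted numbers.
# ASSUMPTION: base context of 4096 tokens, inferred from the quoted
# "max context" figures; the README does not state it explicitly.
BASE_CTX = 4096


def perplexity_drop_pct(baseline: float, improved: float) -> float:
    """Relative perplexity reduction, as a percentage of the baseline."""
    return (baseline - improved) / baseline * 100.0


def max_context(linear_rope_scale: float, base_ctx: int = BASE_CTX) -> int:
    """Context reachable at a given linear rope scale, rounded down."""
    return int(linear_rope_scale * base_ctx)


if __name__ == "__main__":
    # Q3_K_S without vs. with the c32_ch25 iMatrix (wikitext, 512 tokens).
    print(f"Q3_K_S: {perplexity_drop_pct(5.6127, 5.5461):.2f}% lower")
    # Q2_K at rope 10 without vs. with the c32_ch25 iMatrix.
    print(f"Q2_K, rope 10: {perplexity_drop_pct(7.1577, 6.9405):.2f}% lower")
    # Rope scales listed in the README.
    for scale in (2.5, 3, 3.2, 10):
        print(f"rope {scale}: max context {max_context(scale)}")
```

Under that assumption, the results agree with the README's "more than 1%" for Q3_K_S, "minus 3%" for Q2_K at rope 10, and the four listed max contexts.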