Nexesenex committed on
Commit 0a94ba0
1 Parent(s): 664479a

Update README.md

Files changed (1)
  1. README.md +6 -0
README.md CHANGED
@@ -45,9 +45,15 @@ Rope 3 (max context 12288) : WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b
 
 Rope 3.2 (max context 13107) : WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b1924-Q2_K.gguf,-,wikitext,4.6679,512
 
+ And for the adventurous, Rope 10 (max context 40960) : WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b1924-Q2_K.gguf,-,wikitext,7.1577,512
 
 So the linear rope, at least on this model, is flexible, and you can lower it to have the best perplexity for your max context.
 
+ Then, I wonder about applying an NTK rope on top of it to extend it further (even if it screws with the integrity of numbers in chat).
+ Multiply a linear rope (2, 4, 8, whatever) by 5888 (Alpha 1.6, or RBF 16119.8), 6144 (Alpha 1.8, or RBF 18168.7), or even 7424 (Alpha 2.2, or RBF 22277).
+ This is to get a further boost in max context size. Ex with Linear 8 and Alpha 2.2/RBF 22277 : 8*7424 = 59392.
+ It's only theoretical of course, but worth testing.
+
 -----
 
 Benchmarks of the original Q4_K_S quant I found :
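
For reference, here is a minimal Python sketch of the arithmetic behind the rope numbers in the diff above. It assumes the Llama-2 native context of 4096, the default rope frequency base of 10000, a head dimension of 128, and the common NTK-aware mapping base' = base * alpha^(dim/(dim-2)); under those assumptions it reproduces the quoted RBF values (16119.8, 18168.7, 22277) and the 8*7424 = 59392 example. The per-alpha context figures 5888/6144/7424 are taken as given from the text, not derived.

```python
# Sketch of the rope arithmetic above.
# Assumptions (not stated in the commit): Llama-2 native context 4096,
# default rope frequency base 10000, head dimension 128, and the common
# NTK-aware mapping  base' = base * alpha ** (dim / (dim - 2)).

BASE_CTX = 4096        # Llama-2 native context window
BASE_FREQ = 10_000.0   # default rope frequency base
HEAD_DIM = 128         # head dimension of the 70B model (8192 hidden / 64 heads)

def linear_rope_ctx(scale: float) -> int:
    """Max context for a plain linear rope scale, e.g. 3.2 -> 13107."""
    return int(BASE_CTX * scale)

def ntk_freq_base(alpha: float) -> float:
    """Rope frequency base ('RBF') for a given NTK alpha, e.g. 1.8 -> ~18168.7."""
    return BASE_FREQ * alpha ** (HEAD_DIM / (HEAD_DIM - 2))

# Per-alpha context figures quoted in the README (taken as given, not derived here).
NTK_CTX = {1.6: 5888, 1.8: 6144, 2.2: 7424}

def stacked_ctx(linear_scale: float, alpha: float) -> int:
    """Theoretical max context when stacking an NTK alpha on top of a linear rope."""
    return int(linear_scale * NTK_CTX[alpha])

if __name__ == "__main__":
    print(linear_rope_ctx(3.2))          # 13107
    print(round(ntk_freq_base(2.2), 1))  # ~22277 (the 'RBF 22277' above)
    print(stacked_ctx(8, 2.2))           # 59392, the Linear 8 + Alpha 2.2 example
```

In llama.cpp/koboldcpp terms, the linear scale corresponds to the rope scale (equivalently a rope frequency scale of 1/scale) and the RBF value to the rope frequency base set at load time; whether the stacked combination actually holds up would still need perplexity runs like the ones above.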