Update README.md
README.md
@@ -45,9 +45,15 @@ Rope 3 (max context 12288) : WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b
Rope 3.2 (max context 13107) : WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b1924-Q2_K.gguf,-,wikitext,4.6679,512

And for the adventurous, Rope 10 (max context 40960) : WinterGoddess-1.4x-limarpv3-70B-L2-32k-Requant-AR-b1924-Q2_K.gguf,-,wikitext,7.1577,512

So the linear rope, at least on this model, is flexible, and you can lower it to get the best perplexity for your max context.
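For clarity, the max context figures above are just the linear rope scale times Llama 2's native 4096-token context (3 -> 12288, 3.2 -> 13107, 10 -> 40960). Here is a minimal Python sketch of that relation, with the wikitext perplexities quoted above; the variable names are mine, not part of the repo:

```python
# Sketch only: relate the linear rope scale to the max context figures above.
# Assumes Llama 2's native 4096-token training context.
NATIVE_CTX = 4096

# (linear rope scale, wikitext perplexity at 512-token chunks) from the benches above;
# the Rope 3 perplexity is not quoted here, so it is left as None.
benches = [
    (3.0, None),
    (3.2, 4.6679),
    (10.0, 7.1577),
]

for scale, ppl in benches:
    max_ctx = int(scale * NATIVE_CTX)  # 12288, 13107, 40960
    print(f"linear rope {scale:g}: max context ~{max_ctx}, wikitext ppl {ppl}")
```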
Then, I wonder about applying an NTK rope on top of it to extend the context further, even if it screws with the integrity of numbers in chat.

Multiply a linear rope (2, 4, 8, whatever) by 5888 (Alpha 1.6, or RBF 16119.8), 6144 (Alpha 1.8, or RBF 18168.7), or even 7424 (Alpha 2.2, or RBF 22277).

This gives a further boost in max context size. Ex with Linear 8 and Alpha 2.2/RBF 22277 : 8*7424 = 59392.

It's only theoretical of course, but worth testing.
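For what it's worth, the Alpha/RBF pairs above are consistent with the commonly used NTK-aware conversion rope_freq_base = 10000 * alpha^(d/(d-2)) with d = 128, the Llama 2 70B head dimension; the sketch below is my assumption about how those RBF values were derived, and the per-alpha context figures (5888/6144/7424) are taken from the text, not derived. In llama.cpp terms this would mean combining --rope-freq-scale (which takes the inverse of the linear factor, e.g. 0.125 for Linear 8) with --rope-freq-base.

```python
# Sketch only: the usual NTK-alpha -> rope_freq_base conversion, plus the
# combined-context arithmetic from the paragraph above. Names are mine.
HEAD_DIM = 128        # Llama 2 70B head dimension (8192 hidden / 64 heads)
DEFAULT_BASE = 10000.0

def alpha_to_rope_freq_base(alpha: float) -> float:
    # theta' = theta * alpha^(d / (d - 2)); reproduces the RBF values quoted above
    return DEFAULT_BASE * alpha ** (HEAD_DIM / (HEAD_DIM - 2))

# Per-alpha max context figures as quoted in the text (not derived here).
ntk_max_ctx = {1.6: 5888, 1.8: 6144, 2.2: 7424}

for alpha, ctx in ntk_max_ctx.items():
    print(f"alpha {alpha}: rope_freq_base ~{alpha_to_rope_freq_base(alpha):.1f}, max context ~{ctx}")

# Example from the text: Linear 8 on top of Alpha 2.2 -> 8 * 7424 = 59392 tokens (theoretical).
print("combined max context:", 8 * ntk_max_ctx[2.2])
```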
-----

Benchmarks of the original Q4_K_S quant I found :