Update README.md
README.md
CHANGED
@@ -55,6 +55,8 @@ And for the adventurous, Rope 10 : (max context 40960) : WinterGoddess-1.4x-lima
So the linear rope, at least on this model, is flexible, and you can lower it to get the best perplexity for your max context.

+All these results are reproducible, with lower deltas between them, for Q3_K_S, and I suppose for other quants as well.
+
Then, I wonder about applying an NTK rope on top of it to extend it further, even if it screws with the integrity of numbers in chat.
Multiply a linear rope (2, 4, 8, whatever) by 5888 (Alpha 1.6, or RBF 16119.8), 6144 (Alpha 1.8, or RBF 18168.7), and even 7424 (Alpha 2.2, or RBF 22277).
This gives a further boost in max context size. Ex with Linear 8 and Alpha 2.2/RBF 22277 : 8*7424 = 59392.
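For readers who want to check the arithmetic, here is a small Python sketch (mine, not part of the README) that reproduces the Alpha -> RBF figures above using the common NTK-aware rule of thumb, new_base = base * alpha ** (dim / (dim - 2)), under the assumption of a Llama-style head dimension of 128 and an original rope base of 10000. The per-alpha context values (5888, 6144, 7424) are quoted from the text, not derived.

```python
# Hedged sketch: verify the Alpha -> rope base frequency (RBF) mapping quoted
# above and the "linear scale x per-alpha context" arithmetic of the example.
# Assumptions (not stated in the README): head dimension 128 and an original
# rope base frequency of 10000, i.e. a Llama-2-style model.

HEAD_DIM = 128
ORIG_BASE = 10000.0

def ntk_rope_base(alpha: float, dim: int = HEAD_DIM, base: float = ORIG_BASE) -> float:
    """NTK-aware rule of thumb: scale the rope base by alpha ** (dim / (dim - 2))."""
    return base * alpha ** (dim / (dim - 2))

# Per-alpha max-context figures quoted in the README (empirical, not derived here).
CTX_PER_ALPHA = {1.6: 5888, 1.8: 6144, 2.2: 7424}

def max_context(linear_scale: int, alpha: float) -> int:
    """Max context when an NTK rope is stacked on top of a linear rope."""
    return linear_scale * CTX_PER_ALPHA[alpha]

if __name__ == "__main__":
    for a in (1.6, 1.8, 2.2):
        # Prints ~16119.8, ~18168.7, ~22277, matching the RBF values in the text.
        print(f"Alpha {a}: RBF ~ {ntk_rope_base(a):.1f}")
    # Worked example from the README: Linear 8 with Alpha 2.2 -> 8 * 7424 = 59392
    print("Linear 8 + Alpha 2.2:", max_context(8, 2.2))
```

In llama.cpp-style loaders these two knobs correspond, if I remember correctly, to the rope frequency scale (the reciprocal of the linear factor, e.g. 0.125 for Linear 8) and the rope frequency base (the RBF values above); koboldcpp exposes both together through its --ropeconfig option.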