---
license: llama2
---

Quants for Sao10K's model WinterGoddess 1.4x 70B: https://huggingface.co/Sao10K/WinterGoddess-1.4x-70B-L2

With a twist: the model I used comes from a third party and has been tweaked with limarpv3 and a Linear RoPE 8 training to reach 32k context (with even better results at RoPE 4 and RoPE 2, and possibly at other lower RoPE factors as well).

I don't know who did the work; I only found this Q4_K_S quant of it floating around without an FP16: https://huggingface.co/mishima/WinterGoddess-1.4x-limarpv3-70B-L2-32k.GGUF

So I made a Q8_0 out of it (the best base to requantize from), then requantized that Q8_0 to Q3_K_S and Q2_K for my needs.

Lower quants (SOTA 2-bit) to come if I manage to compute an importance matrix (iMatrix) on my setup (64 GB RAM).
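
For reference, the flow described above can be sketched with llama.cpp's command-line tools. This is a hedged sketch, not the exact commands used here: binary names follow recent llama.cpp builds, all GGUF file names are placeholders, and `--allow-requantize` is assumed to be needed since the starting point is itself a quant.

```shell
# Sketch of the requantization flow using llama.cpp's CLI tools.
# All file names below are placeholders, not the repo's actual files.

# 1. Upcast the found Q4_K_S quant to Q8_0, the best base to requantize from.
#    --allow-requantize is required because the input is already quantized.
./llama-quantize --allow-requantize \
    WinterGoddess-32k.Q4_K_S.gguf WinterGoddess-32k.Q8_0.gguf Q8_0

# 2. Requantize the Q8_0 down to the target sizes.
./llama-quantize --allow-requantize \
    WinterGoddess-32k.Q8_0.gguf WinterGoddess-32k.Q3_K_S.gguf Q3_K_S
./llama-quantize --allow-requantize \
    WinterGoddess-32k.Q8_0.gguf WinterGoddess-32k.Q2_K.gguf Q2_K

# 3. For the planned SOTA 2-bit quants, an importance matrix is computed
#    first over a calibration text, then passed to the quantizer
#    (IQ2_XS is one example of a 2-bit iMatrix-based quant type).
./llama-imatrix -m WinterGoddess-32k.Q8_0.gguf -f calibration.txt -o imatrix.dat
./llama-quantize --allow-requantize --imatrix imatrix.dat \
    WinterGoddess-32k.Q8_0.gguf WinterGoddess-32k.IQ2_XS.gguf IQ2_XS
```

The 64 GB RAM constraint mentioned above matters for step 3, since the imatrix pass has to run inference over the calibration data.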