Committed by Nexesenex
Commit fb803af · verified · 1 Parent(s): a438733

Create README.md

Files changed (1)
  1. README.md +14 -0
README.md ADDED
@@ -0,0 +1,14 @@
+ Custom Quants for MistralAI Mistral Large v2 123b
+
+ IQ4_XXSR: basically IQ4_XS, but with attn_q in IQ3_S, attn_v in Q6_K, and token_embed in Q6_0.
+ Yes, you read that correctly: Ikawrakow's latest traditional quant, not available in mainline Llama.cpp.
+
+ WARNING:
+ Compatible only with IK_Llama.cpp and Croco.cpp (my fork of the great KoboldCpp). I'll release an .exe soon, but it already works (at least on Windows) for those who can compile it.
+ https://github.com/Nexesenex/croco.cpp
+
+ Overall, maybe it's time for the Llama.cpp team to take a look at Ikawrakow's latest work and offer him terms of cooperation, so we can once again enjoy SOTA quants in Llama.cpp.
+ https://github.com/ikawrakow/ik_llama.cpp
+
+ Because the situation is becoming grotesque: we are mass-quantizing models with non-SOTA quants while better ones are within reach.
+ Thousands of terabytes of storage space, along with our compute and our time, are wasted because of this situation.
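
The IQ4_XXSR recipe described in the README (an IQ4_XS base with three per-tensor overrides) can be pictured as a small lookup table. The following is only an illustrative sketch, not the author's tooling: the function name `quant_type_for`, the glob patterns, and the tensor names (`blk.N.attn_q.weight`, `token_embd.weight`, following the usual GGUF naming) are assumptions, and a real IQ4_XS preset may itself mix several types for the tensors not listed here.

```python
from fnmatch import fnmatchcase

# Base preset named in the README. Assumption: treated here as one uniform
# fallback, although a real preset may assign different types per tensor.
BASE_TYPE = "IQ4_XS"

# Per-tensor overrides from the README. The patterns are hypothetical and
# assume GGUF-style tensor names.
OVERRIDES = [
    ("*attn_q.weight", "IQ3_S"),     # attn_q in IQ3_S
    ("*attn_v.weight", "Q6_K"),      # attn_v in Q6_K
    ("token_embd.weight", "Q6_0"),   # token_embed in Q6_0
]


def quant_type_for(tensor_name: str) -> str:
    """Return the quant type this sketch would assign to a tensor name."""
    for pattern, qtype in OVERRIDES:
        if fnmatchcase(tensor_name, pattern):
            return qtype
    return BASE_TYPE


if __name__ == "__main__":
    for name in ("token_embd.weight", "blk.0.attn_q.weight",
                 "blk.0.attn_v.weight", "blk.0.ffn_down.weight"):
        print(f"{name}: {quant_type_for(name)}")
```

The first matching pattern wins, so more specific overrides should be listed before broader ones if the table is extended.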