Nexesenex committed
Commit dd5d800
1 Parent(s): dfe0b4d

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -11,6 +11,8 @@ So I made a Q8_0 out of it (best way to requantize after), and requantized it in
 
 Lowers quants (SOTA 2 bits) to come if I'm able to make an iMatrix on my config (64GB RAM).
 
+And, as a bonus to play with it, my KoboldCPP_-_v1.55.1.b1933_-_Frankenstein build from 21/01/2024: https://github.com/Nexesenex/kobold.cpp/releases/tag/v1.55.1_b1933
+
 -----
 
 Edit: Due to a CPU (i7-6700k) that is weak for AI purposes and only 36GB of VRAM, I remade Q3_K_S and Q2_K with a small iMatrix of ctx 32 with 25 chunks (so, 640 tokens), and it lowers the perplexity by:
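For context on the iMatrix/requantization step mentioned in the diff above, here is a minimal sketch of how a small importance matrix and the requantized Q3_K_S / Q2_K files could be produced with the llama.cpp tools of that period. The binary names (`imatrix`, `quantize`), flag spellings, and file names (`model-f16.gguf`, `calibration.txt`, `imatrix.dat`) are assumptions for illustration, not taken from this commit.

```python
# Hypothetical sketch, not the author's exact commands: regenerate a small
# importance matrix (ctx 32, 25 chunks, as described in the README edit) and
# requantize with it, assuming the llama.cpp "imatrix" and "quantize" tools.
import subprocess

MODEL_SRC = "model-f16.gguf"     # assumed high-precision source GGUF (placeholder name)
CALIB_TEXT = "calibration.txt"   # assumed calibration text for the iMatrix (placeholder)
IMATRIX_OUT = "imatrix.dat"      # output file for the importance matrix (placeholder)

# Build a small importance matrix: context of 32 tokens, 25 chunks.
subprocess.run(
    ["./imatrix", "-m", MODEL_SRC, "-f", CALIB_TEXT,
     "-o", IMATRIX_OUT, "-c", "32", "--chunks", "25"],
    check=True,
)

# Requantize to Q3_K_S and Q2_K using that iMatrix.
for qtype in ("Q3_K_S", "Q2_K"):
    subprocess.run(
        ["./quantize", "--imatrix", IMATRIX_OUT,
         MODEL_SRC, f"model-{qtype}.gguf", qtype],
        check=True,
    )
```

The resulting low-bit GGUF files are what the perplexity comparison in the README refers to; the larger iMatrix mentioned earlier (requiring more than 64GB RAM) would use the same workflow with a longer context and more chunks.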