Update README.md
README.md
@@ -27,6 +27,9 @@ Full offload possible on 24GB VRAM with a decent context size.
 - IQ2_XS SOTA
 - Lower quality : IQ2_XXS SOTA
 
+Full offload possible on 16GB VRAM with a decent context size.
+- IQ1_S v2 (prefer to v1)
+
 ---
 
 Bonus : a Kobold.CPP Frankenstein which reads IQ3_XXS models and is not affected by the Kobold.CPP 1.56/1.57 slowdown at the cost of an absent Mixtral fix.