mradermacher committed
Commit 584b2c7
Parent(s): ecb01d7
Update README.md

README.md CHANGED
@@ -48,13 +48,14 @@ And they are generally (but not always) generated in the order above, for which
 
 For models less than 11B size, I experimentally generate f16 versions at the moment (in the static repository).
 
-For models less than 19B size, imatrix IQ4_NL quants will be generated, mostly for the benefit of arm
+For models less than 19B size, imatrix IQ4_NL quants will be generated, mostly for the benefit of arm,
+where it can give a speed benefit.
 
 The (static) IQ3 quants are no longer generated, as they consistently seem to result in *much* lower quality
-quants than even static Q2_K, so it would be s disservice to offer them.
+quants than even static Q2_K, so it would be a disservice to offer them. *Update*: That might no longer be true, and they might come back.
 
-I specifically do not do Q2_K_S, because I generally think it is not worth it
-a lot of computing and is generally completely superseded by IQ4_XS.
+I specifically do not do Q2_K_S, because I generally think it is not worth it (IQ2_M usually being smaller and better, albeit slower),
+and IQ4_NL, because it requires a lot of computing and is generally completely superseded by IQ4_XS.
 
 Q8_0 imatrix quants do not exist - some quanters claim otherwise, but Q8_0 ggufs do not contain any tensor
 type that uses the imatrix data, although technically it might be possible to do so.
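As a back-of-the-envelope aside, the size cutoffs in the text (f16 below 11B, IQ4_NL below 19B) can be sanity-checked with file size ≈ parameters × bits-per-weight / 8. This is only a sketch: the ~4.5 bits-per-weight figure used for IQ4_NL is an approximate value, not an exact one, and real GGUF files carry some metadata overhead.

```python
def gguf_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Rough GGUF file size in GB: parameters * bpw / 8 (metadata ignored)."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# f16 (16 bpw) of an 11B model -- the stated cutoff for static f16 uploads
print(round(gguf_size_gb(11, 16.0), 1))  # 22.0 (GB)

# IQ4_NL (~4.5 bpw, assumed) of a 19B model -- the stated IQ4_NL cutoff
print(round(gguf_size_gb(19, 4.5), 1))   # 10.7 (GB)
```

So the cutoffs roughly cap the generated files at sizes that fit common consumer hardware, which matches the rationale in the README.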