mradermacher committed on
Commit
584b2c7
1 Parent(s): ecb01d7

Update README.md

  1. README.md +5 -4
README.md CHANGED
@@ -48,13 +48,14 @@ And they are generally (but not always) generated in the order above, for which
 
 For models less than 11B size, I experimentally generate f16 versions at the moment (in the static repository).
 
-For models less than 19B size, imatrix IQ4_NL quants will be generated, mostly for the benefit of arm.
+For models less than 19B size, imatrix IQ4_NL quants will be generated, mostly for the benefit of arm,
+where it can give a speed benefit.
 
 The (static) IQ3 quants are no longer generated, as they consistently seem to result in *much* lower quality
-quants than even static Q2_K, so it would be s disservice to offer them.
+quants than even static Q2_K, so it would be s disservice to offer them. *Update*: That might no longer be true, and they might come back.
 
-I specifically do not do Q2_K_S, because I generally think it is not worth it, and IQ4_NL, because it requires
-a lot of computing and is generally completely superseded by IQ4_XS.
+I specifically do not do Q2_K_S, because I generally think it is not worth it (IQ2_M usually being smaller and better, albeit slower),
+and IQ4_NL, because it requires a lot of computing and is generally completely superseded by IQ4_XS.
 
 Q8_0 imatrix quants do not exist - some quanters claim otherwise, but Q8_0 ggufs do not contain any tensor
 type that uses the imatrix data, although technically it might be possible to do so.
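
Reviewer note: the two size cutoffs mentioned in this hunk (f16 below 11B in the static repository, imatrix IQ4_NL below 19B) can be read as a simple rule. The sketch below only illustrates that reading. The function name, the constants, and the dictionary layout are hypothetical and are not taken from the author's tooling, and the README itself describes the f16 rule as experimental.

```python
# Illustrative only: encodes the two size thresholds stated in the README text.
# All names here are assumptions made for this sketch, not the author's scripts.
F16_LIMIT_B = 11.0      # "less than 11B": f16 in the static repository
IQ4_NL_LIMIT_B = 19.0   # "less than 19B": imatrix IQ4_NL


def size_dependent_quants(params_billion: float) -> dict[str, list[str]]:
    """Return the extra quant types the README ties to model size, per repository kind."""
    extras: dict[str, list[str]] = {"static": [], "imatrix": []}
    if params_billion < F16_LIMIT_B:
        extras["static"].append("f16")
    if params_billion < IQ4_NL_LIMIT_B:
        extras["imatrix"].append("IQ4_NL")
    return extras


if __name__ == "__main__":
    # Print the size-dependent extras for a few example model sizes (in billions).
    for size in (7, 13, 70):
        print(size, size_dependent_quants(size))
```

Both comparisons are strict, matching the "less than" wording, so a model of exactly 11B or 19B parameters falls outside the respective rule under this reading.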