gobean committed
Commit b271dc3
1 Parent(s): b2c06e8

Update README.md

Files changed (1): README.md +1 -1
README.md CHANGED
@@ -16,7 +16,7 @@ These are here for reference, comparison, and any future work.
 
 The quality of the llamafiles generated from these freshly converted GGUFs were noticeably better than those generated from the other GGUFs on HF.
 
-These three were most interesting because:
+quant file notes:
 - q3-k-m: can fit entirely on a 4090 (24GB VRAM), very fast inference
 - q4-0: for some reason, this is better quality than q4-k-m.
 - q4-k-m: the widely accepted standard as "good enough" and general favorite for most models, but in this case it does not fit on a 4090
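The "fits on a 4090 / does not fit" claims in the quant notes can be sanity-checked with a back-of-envelope estimate of quantized weight size (weights only, ignoring KV cache and runtime overhead). The bits-per-weight figures below are approximate llama.cpp values, and the 40B parameter count is a purely hypothetical model size for illustration; neither is taken from this repo:

```python
# Rough GGUF weight size: n_params * bits_per_weight / 8 bytes.
# bpw values are approximate llama.cpp k-quant averages (assumptions).
BPW = {
    "q3-k-m": 3.91,  # ~Q3_K_M
    "q4-0":   4.55,  # ~Q4_0
    "q4-k-m": 4.85,  # ~Q4_K_M
}

def est_size_gb(n_params: float, bpw: float) -> float:
    """Estimated quantized-weights size in GB (1 GB = 1e9 bytes)."""
    return n_params * bpw / 8 / 1e9

n_params = 40e9  # hypothetical parameter count, for illustration only
for name, bpw in BPW.items():
    print(f"{name}: ~{est_size_gb(n_params, bpw):.1f} GB")
```

For a model in this size range, the q3-k-m weights land comfortably under 24GB while q4-k-m sits at the edge, where KV cache and overhead push it off the card; that matches the pattern described in the notes.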