gobean committed
Commit 8bda1e3
1 Parent(s): 749c4b1

Update README.md

Files changed (1):
  1. README.md +3 -3
README.md CHANGED
@@ -13,6 +13,6 @@ These are here for reference, comparison, and any future work.
 The quality of the llamafiles generated from these freshly converted GGUFs was noticeably better than that of the llamafiles generated from the other GGUFs on HF.

 These three were most interesting because:
-q3-k-m: can fit entirely on a 4090 (24GB VRAM), very fast inference
-q4-k-m: the widely accepted standard as "good enough", but in this case it does not fit on a 4090
-q5-k-m: my favorite for smaller models; for larger ones it provides a reference for "what if you have more than just a bit that won't fit on the GPU"
+ - q3-k-m: can fit entirely on a 4090 (24GB VRAM), very fast inference
+ - q4-k-m: the widely accepted standard as "good enough", but in this case it does not fit on a 4090
+ - q5-k-m: my favorite for smaller models; for larger ones it provides a reference for "what if you have more than just a bit that won't fit on the GPU"
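The VRAM-fit reasoning behind those three bullets can be sketched as a quick back-of-the-envelope check. The bits-per-weight figures below are approximate assumptions for llama.cpp k-quants (not exact values), and `overhead_gb` is a hypothetical allowance for KV cache and runtime buffers:

```python
# Rough check of whether a GGUF quant's weights fit in a given amount of VRAM.
# The bits-per-weight (bpw) figures are approximate assumptions for
# llama.cpp k-quants; real file sizes vary by model architecture.
APPROX_BPW = {"q3-k-m": 3.9, "q4-k-m": 4.8, "q5-k-m": 5.7}

def fits_in_vram(params_billions: float, quant: str,
                 vram_gb: float = 24.0, overhead_gb: float = 1.0) -> bool:
    """Estimate weight size and compare against available VRAM.

    overhead_gb is a rough, assumed allowance for KV cache and buffers.
    """
    # params * bpw bits, divided by 8 bits/byte; the 1e9 factors cancel.
    weights_gb = params_billions * APPROX_BPW[quant] / 8
    return weights_gb + overhead_gb <= vram_gb

# Example with a hypothetical ~46.7B-parameter model: under these
# assumptions q3-k-m squeezes onto a 24GB 4090, while q4-k-m does not.
print(fits_in_vram(46.7, "q3-k-m"))  # True under these assumptions
print(fits_in_vram(46.7, "q4-k-m"))  # False under these assumptions
```

The point of the sketch is only that the gap between q3 and q4 straddles the 24GB boundary for models in this size range, which is what makes the three quants an interesting comparison set.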