gobean committed
Commit 24279c5
1 Parent(s): 8bda1e3

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -14,5 +14,5 @@ The quality of the llamafiles generated from these freshly converted GGUFs were
 
 These three were most interesting because:
 - q3-k-m: can fit entirely on a 4090 (24GB VRAM), very fast inference
-- q4-k-m: the widely accepted standard as "good enough", but in this case it does not fit on a 4090
+- q4-k-m: the widely accepted standard as "good enough" and general favorite for most models, but in this case it does not fit on a 4090
 - q5-k-m: my favorite for smaller models, larger - provides a reference for "what if you have more than just a bit that won't fit on the gpu"