Update README.md
README.md
CHANGED
@@ -13,6 +13,6 @@ These are here for refence, comparison, and any future work.
 The quality of the llamafiles generated from these freshly converted GGUFs were noticeably better than those generated from the other GGUFs on HF.

 These three were most interesting because:
-q3-k-m: can fit entirely on a 4090 (24GB VRAM), very fast inference
-q4-k-m: the widely accepted standard as "good enough", but in this case it does not fit on a 4090
-q5-k-m: my favorite for smaller models, larger - provides a reference for "what if you have more than just a bit that won't fit on the gpu"
+- q3-k-m: can fit entirely on a 4090 (24GB VRAM), very fast inference
+- q4-k-m: the widely accepted standard as "good enough", but in this case it does not fit on a 4090
+- q5-k-m: my favorite for smaller models, larger - provides a reference for "what if you have more than just a bit that won't fit on the gpu"