Upload tinyllama-1.1b-chat-v1.0.Q4_1.gguf
I've been working on adding [GGUF support to MLX](https://github.com/ml-explore/mlx/pull/350), and Q4_1 seems like the format that is most aligned with MLX quantization. Its quantization error is also slightly lower than Q4_0's (tested with [gguf-tools](https://github.com/antirez/gguf-tools/pull/9)).
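For context, Q4_1 stores each block of 32 weights as an fp16 scale `d`, an fp16 minimum `m`, and 16 bytes of packed 4-bit quants, dequantized as `w = d * q + m` — the same scale-plus-bias affine scheme MLX quantization uses, which is why the formats line up well. A minimal sketch of the block layout as used in ggml/llama.cpp (the function name is mine, for illustration):

```python
import struct

def dequantize_q4_1_block(block: bytes) -> list[float]:
    """Dequantize one Q4_1 block (20 bytes -> 32 weights).

    Layout: fp16 scale d, fp16 minimum m, then 16 bytes packing
    32 unsigned 4-bit quants; each weight is w = d * q + m.
    """
    d = struct.unpack("<e", block[0:2])[0]  # fp16 scale
    m = struct.unpack("<e", block[2:4])[0]  # fp16 minimum (bias)
    quants = block[4:20]
    # ggml stores the low nibbles of all 16 bytes as weights 0..15,
    # and the high nibbles as weights 16..31.
    low = [d * (b & 0x0F) + m for b in quants]
    high = [d * (b >> 4) + m for b in quants]
    return low + high
```

With `d = 1.0`, `m = 0.0`, and every byte `0x21`, the low nibbles decode to sixteen 1.0s and the high nibbles to sixteen 2.0s, which makes the packing order easy to verify by hand.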
- .gitattributes +1 -0
- tinyllama-1.1b-chat-v1.0.Q4_1.gguf +3 -0
.gitattributes
CHANGED
```diff
@@ -45,3 +45,4 @@ tinyllama-1.1b-chat-v1.0.Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
 tinyllama-1.1b-chat-v1.0.Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
 tinyllama-1.1b-chat-v1.0.Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
 tinyllama-1.1b-chat-v1.0.Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+tinyllama-1.1b-chat-v1.0.Q4_1.gguf filter=lfs diff=lfs merge=lfs -text
```
tinyllama-1.1b-chat-v1.0.Q4_1.gguf
ADDED
```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:54509f708568d36d4f3186433525340fcf47ab441f3faa87d826af04a3538268
+size 702350688
```