Bojun-Feng committed
Commit 14a1730
1 Parent(s): d8d84d4
Add q8_0 model to README.md
README.md CHANGED
@@ -19,7 +19,7 @@ GGML files are for CPU + GPU inference using [chatglm.cpp](https://github.com/li
 | chatglm2-ggml-q4_1.bin | q4_1 | 4 | 3.9 GB |
 | chatglm2-ggml-q5_0.bin | q5_0 | 5 | 4.3 GB |
 | chatglm2-ggml-q5_1.bin | q5_1 | 5 | 4.7 GB |
-| chatglm2-ggml-
+| chatglm2-ggml-q8_0.bin | q8_0 | 8 | 6.6 GB |


 # How to run in xorbits-inference
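The file sizes in the table above can be sanity-checked from the bits-per-weight of each quantization type. A minimal sketch, assuming ChatGLM2 has roughly 6.2 billion weights and the standard GGML 32-weight block layouts (both are assumptions for illustration, not stated in this commit):

```python
# Sketch: estimate GGML file sizes from bytes-per-block.
# Assumptions (not from this commit): ~6.2e9 parameters for ChatGLM2-6B,
# and the usual GGML layout of 32 weights per quantization block.
N_PARAMS = 6.2e9

# Approximate bytes per 32-weight block for each quantization type.
BLOCK_BYTES = {
    "q4_1": 20,  # 16 bytes of 4-bit quants + two fp16 values (scale, min)
    "q5_0": 22,  # 16 bytes of 4-bit quants + 4 bytes of high bits + fp16 scale
    "q5_1": 24,  # as q5_0 plus an fp16 min value
    "q8_0": 34,  # 32 bytes of 8-bit quants + fp16 scale
}

def approx_size_gb(quant: str) -> float:
    """Approximate model file size in GB (1 GB = 1e9 bytes)."""
    bytes_per_weight = BLOCK_BYTES[quant] / 32
    return N_PARAMS * bytes_per_weight / 1e9

for q in BLOCK_BYTES:
    print(f"{q}: ~{approx_size_gb(q):.1f} GB")
```

Under these assumptions the estimates land within ~0.1 GB of every row in the table, including the new q8_0 entry at 6.6 GB (8 bits per weight plus a per-block scale).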