eformat
/

granite-3.0-8b-instruct-Q4_K_M-GGUF

Text Generation


eformat committed 29 days ago

Commit

b87ae9a


1 Parent(s): 54ea565

Update README.md

Files changed (1)

README.md +33 -1

README.md CHANGED

@@ -12,4 +12,36 @@ base_model: ibm-granite/granite-3.0-8b-instruct
 ---
-# eformat/granite-3.0-8b-instruct-Q4_K_M-GGUF

 ---
+# eformat/granite-3.0-8b-instruct-Q4_K_M-GGUF
+As of 25/10/2024, not all tools (vllm, llama.cpp) seem to support the new model config params:
+```json
+# config.json
+"model_type": "granite"
+"architectures": [
+  "GraniteForCausalLM"
+]
+```
+This GGUF conversion was done using the old ones:
+```json
+# config.json
+"model_type": "llama"
+"architectures": [
+  "LlamaForCausalLM"
+]
+```
+This GGUF loads OK; tested using:
+```bash
+# llama.cpp
+./llama-server --verbose --gpu-layers 99999 --parallel 2 --ctx-size 4096 -m ~/instructlab/models/granite-3.0-8b-instruct-Q4_K_M.gguf
+```
+```bash
+# vllm
+vllm serve ~/instructlab/models/granite-3.0-8b-instruct-Q4_K_M.gguf
+```
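
The README describes converting with the old `llama` identifiers in place of the new `granite` ones, but does not show how `config.json` was changed. One way this could be scripted is sketched below. The function names `use_llama_config` and `patch_config_file` are hypothetical, and the approach (rewriting the two fields before running a GGUF conversion) is an assumption, not necessarily what the author did. Only the four values from the README's two `config.json` fragments are used.

```python
import json
from pathlib import Path

# Values copied from the two config.json fragments in the README diff:
# key -> (new Granite value, old Llama value).
NEW_TO_OLD = {
    "model_type": ("granite", "llama"),
    "architectures": (["GraniteForCausalLM"], ["LlamaForCausalLM"]),
}


def use_llama_config(config: dict) -> dict:
    """Return a copy of config with the Granite identifiers swapped for the Llama ones.

    Keys that do not hold the expected Granite value are left untouched.
    """
    patched = dict(config)
    for key, (new_value, old_value) in NEW_TO_OLD.items():
        if patched.get(key) == new_value:
            patched[key] = old_value
    return patched


def patch_config_file(path: str) -> None:
    """Hypothetical helper: rewrite a config.json in place before conversion."""
    config_path = Path(path)
    config = json.loads(config_path.read_text())
    config_path.write_text(json.dumps(use_llama_config(config), indent=2) + "\n")


if __name__ == "__main__":
    # "example_key" is a placeholder, not a real Granite config field.
    example = {
        "model_type": "granite",
        "architectures": ["GraniteForCausalLM"],
        "example_key": "unchanged",
    }
    print(json.dumps(use_llama_config(example), indent=2))
```

Returning a patched copy rather than mutating the input keeps the original config available for comparison. Whether a model converted this way behaves identically to the original is not something this sketch checks.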