QuantPanda/LongWriter-glm4-9B-GGUF

QuantPanda committed on Aug 21

Commit 18ab389 • 1 Parent(s): d8d98ce

Update README.md

Files changed (1)

README.md +30 -5

@@ -1,5 +1,30 @@
----
-license: other
-license_name: glm-4-9b-license
-license_link: https://huggingface.co/THUDM/glm-4-9b-chat/blob/main/LICENSE
----

+---
+license: other
+license_name: glm-4-9b-license
+license_link: https://huggingface.co/THUDM/glm-4-9b-chat/blob/main/LICENSE
+datasets:
+- THUDM/LongWriter-6k
+language:
+- en
+pipeline_tag: text-generation
+---
+# LongWriter-glm4-9b
+Original model link: https://huggingface.co/THUDM/LongWriter-glm4-9b
+Model by: **THUDM**
+Quants by: **QuantPanda**
+GGUF quantization for [llama.cpp](https://github.com/ggerganov/llama.cpp) and similar applications.
+**Example:**
+``./llama-cli -m LongWriter-glm4-9B-Q5_K_M.gguf -p "You are a helpful AI assistant." --conversation``
+If the model takes too long to load, you can reduce the context size with ``--ctx-size``.
+**Example with smaller context size:**
+``./llama-cli -m LongWriter-glm4-9B-Q5_K_M.gguf -p "You are a helpful AI assistant." --conversation --ctx-size 4096``
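
The two invocations in the README differ only in the `--ctx-size` flag. As a rough sketch of how that choice could be scripted, the snippet below builds the same command and appends `--ctx-size` only when a value is supplied. The `build_llama_cmd` function and the `CTX_SIZE` variable are hypothetical names introduced here for illustration; they are not part of the README or of llama.cpp.

```shell
#!/bin/sh
# Hypothetical helper (not from the README): assembles the llama-cli command
# shown in the README and adds --ctx-size only when CTX_SIZE is non-empty.
MODEL="LongWriter-glm4-9B-Q5_K_M.gguf"

build_llama_cmd() {
    # Base command, identical to the first README example.
    cmd="./llama-cli -m $MODEL -p \"You are a helpful AI assistant.\" --conversation"
    # Append the smaller context size only if the caller asked for one.
    if [ -n "${CTX_SIZE:-}" ]; then
        cmd="$cmd --ctx-size $CTX_SIZE"
    fi
    printf '%s\n' "$cmd"
}

# Print the command so it can be inspected before running it.
build_llama_cmd
```

The script only prints the command, so it can be checked first and then piped to `sh` from the directory that contains `llama-cli` and the GGUF file.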