K024/chatglm2-6b-int4g32


K024 committed on Jul 13, 2023

Commit ecf83ca • 1 Parent(s): 998744f

Create README.md

Files changed (1)

README.md +31 -0

README.md ADDED

	@@ -0,0 +1,31 @@

+---
+language:
+  - zh
+  - en
+tags:
+  - glm
+  - chatglm
+  - thudm
+---
+# ChatGLM2 6b int4 g32 quantized model
+See [K024/chatglm-q](https://github.com/K024/chatglm-q) for more details.
+```python
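+import torch
+from chatglm_q.decoder import ChatGLMDecoder, chat_template
+# This example targets a CUDA GPU.
+device = torch.device("cuda")
+# Load the quantized weights from the Hugging Face Hub repository.
+decoder = ChatGLMDecoder.from_pretrained("K024/chatglm2-6b-int4g32", device=device)
+# Build a prompt from an empty history and one question ("我是谁？" means "Who am I?").
+prompt = chat_template([], "我是谁？")
+# generate() is iterated here; each item it yields is printed.
+for text in decoder.generate(prompt):
+    print(text)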
+```
+Model weights are released under the same license as ChatGLM2-6b, see [MODEL LICENSE](https://huggingface.co/THUDM/chatglm2-6b/blob/main/MODEL_LICENSE).
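
Reader's note (not part of the commit): the "int4 g32" in the model name most likely means 4-bit integer weights quantized in groups of 32, although the README does not say so. Assuming that reading, the sketch below shows the general idea of group-wise quantization in plain Python. The names `GROUP_SIZE`, `quantize`, and `dequantize` are hypothetical, and the min/max scheme is a common textbook choice, not necessarily what chatglm-q does.

```python
import math
from typing import List, Tuple

GROUP_SIZE = 32  # assumed meaning of "g32"
MAX_CODE = 15    # a 4-bit unsigned code ranges over 0..15

# One quantized group: (4-bit codes, scale, minimum value of the group)
Group = Tuple[List[int], float, float]


def quantize(weights: List[float]) -> List[Group]:
    """Quantize a flat list of weights, one scale and offset per group."""
    groups = []
    for start in range(0, len(weights), GROUP_SIZE):
        chunk = weights[start:start + GROUP_SIZE]
        lo, hi = min(chunk), max(chunk)
        # Fall back to 1.0 for a constant group to avoid dividing by zero.
        scale = (hi - lo) / MAX_CODE or 1.0
        codes = [min(MAX_CODE, max(0, round((w - lo) / scale))) for w in chunk]
        groups.append((codes, scale, lo))
    return groups


def dequantize(groups: List[Group]) -> List[float]:
    """Map the 4-bit codes back to approximate floating-point weights."""
    restored = []
    for codes, scale, lo in groups:
        restored.extend(lo + code * scale for code in codes)
    return restored


# Toy data: 64 values, so two groups of 32.
weights = [math.sin(i) for i in range(64)]
groups = quantize(weights)
restored = dequantize(groups)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(len(groups), max_error)
```

Because each group stores its own scale and offset, the rounding error of any weight is bounded by half of that group's scale. Smaller groups therefore track the local weight range more closely, at the cost of storing more scales.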