npc0/chatglm3-6b-32k-int4

npc0 committed on Nov 23, 2023

Commit d5f672b (1 parent: 569022b)

Create README.md

Files changed (1)

README.md +42 -0

	@@ -0,0 +1,42 @@

+---
+language:
+- zh
+- en
+tags:
+- glm
+- chatglm
+- ggml
+---
+# ChatGLM3-6B-32k-int4
+## Introduction
+ChatGLM3-6B-32k is the latest-generation open-source model in the ChatGLM series; see [THUDM/chatglm3-6b](https://github.com/THUDM/ChatGLM3).
+This repository stores Q4_0 and Q4_1 weights produced by GGML quantization with [ChatGLM.CPP](https://github.com/li-plus/chatglm.cpp).
+## Performance
+|Model                     |GGML quantization method|File size|ms/token\*|
+|--------------------------|------------------------|---------|----------|
+|chatglm3-32k-ggml-q4_0.bin|          q4_0          | ?.?? GB |  ???ms   |
+|chatglm3-32k-ggml-q4_1.bin|          q4_1          | ?.?? GB |  ???ms   |
+\* Milliseconds per token on CPU (Platinum 8260), from the [ChatGLM.CPP performance reference](https://github.com/li-plus/chatglm.cpp#performance)
+## Getting Started
+1. Install dependencies
+  ```sh
+  pip install chatglm-cpp transformers
+  ```
+2. Download the weights
+  ```sh
+  wget https://huggingface.co/npc0/chatglm3-6b-fp16/resolve/main/chatglm3-32k-ggml-q4_0.bin
+  ```
+3. Run the model
+  ```py
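+  import chatglm_cpp
+
+  # Load the quantized GGML weights downloaded in step 2.
+  pipeline = chatglm_cpp.Pipeline("./chatglm3-32k-ggml-q4_0.bin")
+  # Send one user message ("你好" means "Hello") and print the reply.
+  print(pipeline.chat(["你好"]))
+  # Output: 你好👋！我是人工智能助手 ChatGLM3-6B，很高兴见到你，欢迎问我任何问题。
+  # (English: "Hello 👋! I am the AI assistant ChatGLM3-6B. Nice to meet you; feel free to ask me anything.")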
+  ```
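
The README names two quantization methods, Q4_0 and Q4_1, without saying how they differ. The sketch below is a simplified pure-Python illustration of the two block-wise 4-bit schemes as they are commonly described for GGML. It is not the ChatGLM.CPP or GGML implementation: the function names, the list-based representation, and the block size constant are assumptions made for illustration, and the real code packs two 4-bit codes per byte and stores scales in half precision.

```python
from typing import List, Tuple

BLOCK = 32  # assumed number of weights sharing one scale


def quantize_q4_0(block: List[float]) -> Tuple[float, List[int]]:
    """Q4_0-style sketch: one signed scale per block, codes in 0..15."""
    # Take the value with the largest magnitude, keeping its sign,
    # so that it maps exactly onto code 0 (that is, -8 after the offset).
    extreme = max(block, key=abs)
    d = extreme / -8.0
    inv = 1.0 / d if d != 0.0 else 0.0
    # Shift by 8 to get unsigned codes, round, and clip to 4 bits.
    codes = [min(15, int(x * inv + 8.5)) for x in block]
    return d, codes


def dequantize_q4_0(d: float, codes: List[int]) -> List[float]:
    """Invert the Q4_0-style sketch: remove the offset, then rescale."""
    return [(q - 8) * d for q in codes]


def quantize_q4_1(block: List[float]) -> Tuple[float, float, List[int]]:
    """Q4_1-style sketch: a scale plus a minimum (offset) per block."""
    lo, hi = min(block), max(block)
    d = (hi - lo) / 15.0
    inv = 1.0 / d if d != 0.0 else 0.0
    codes = [min(15, int((x - lo) * inv + 0.5)) for x in block]
    return d, lo, codes


def dequantize_q4_1(d: float, lo: float, codes: List[int]) -> List[float]:
    """Invert the Q4_1-style sketch: rescale, then add the minimum back."""
    return [q * d + lo for q in codes]


if __name__ == "__main__":
    # Hypothetical weights, only to show the round trip.
    weights = [((i * 7) % BLOCK) / 31.0 - 0.5 for i in range(BLOCK)]
    d0, c0 = quantize_q4_0(weights)
    d1, m1, c1 = quantize_q4_1(weights)
    err0 = max(abs(a - b) for a, b in zip(weights, dequantize_q4_0(d0, c0)))
    err1 = max(abs(a - b) for a, b in zip(weights, dequantize_q4_1(d1, m1, c1)))
    print("max abs error, Q4_0-style:", err0)
    print("max abs error, Q4_1-style:", err1)
```

Under these assumptions, the Q4_1-style variant stores one extra number per block (the minimum) and in exchange can track blocks whose values are not centered on zero, which is the usual reason given for its larger file size.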