---
license: apache-2.0
---
## chatglm3-ggml

This repo contains GGML format model files for ChatGLM3-6B.
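If you only need the raw GGML weights (for example, to load them with a local GGML runtime instead of going through Xinference), the sketch below downloads a single file with `huggingface_hub`; the `repo_id` and `filename` shown are assumptions, so check this repo's file list for the actual names.

```python
# Sketch: fetch one GGML file from the Hub.
# repo_id and filename are assumed values -- replace them with the
# actual repo id and file name listed on this repo's "Files" page.
from huggingface_hub import hf_hub_download

local_path = hf_hub_download(
    repo_id="Xorbits/chatglm3-ggml",       # assumed repo id
    filename="chatglm3-ggml-q4_0.bin",     # assumed quantized file name
)
print(local_path)
```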
### Example code

#### Install packages
```bash
pip install "xinference[ggml]>=0.4.3"
```
If you want to run with GPU acceleration, refer to [installation](https://github.com/xorbitsai/inference#installation).

#### Start a local instance of Xinference
```bash
xinference -p 9997
```
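Once the server is up, a quick sanity check (a sketch, assuming the default host and the port used above) is to connect a client and list the running models; a fresh instance should report none.

```python
# Sketch: verify the local Xinference endpoint is reachable.
# Assumes the server started above is listening on localhost:9997.
from xinference.client import Client

client = Client("http://localhost:9997")
print(client.list_models())  # an empty result means no models are launched yet
```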
#### Launch and inference
```python
from xinference.client import Client

client = Client("http://localhost:9997")

# Launch the built-in chatglm3 model in GGML format with 4-bit quantization.
model_uid = client.launch_model(
    model_name="chatglm3",
    model_format="ggmlv3",
    model_size_in_billions=6,
    quantization="q4_0",
)
model = client.get_model(model_uid)

# Ask a question in Chinese: "What is the largest animal?"
chat_history = []
prompt = "最大的动物是什么?"
completion = model.chat(
    prompt,
    chat_history,
    generate_config={"max_tokens": 1024},
)
print(completion)
```
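Xinference's chat responses generally mirror the OpenAI chat-completion layout; assuming that shape, the sketch below pulls the assistant's text out of `choices` and feeds it back as history for a follow-up turn.

```python
# Sketch: extract the reply and continue the conversation.
# Assumes chat() returns an OpenAI-style ChatCompletion dict and that
# chat_history holds {"role": ..., "content": ...} messages.
reply = completion["choices"][0]["message"]["content"]
print(reply)

chat_history.append({"role": "user", "content": prompt})
chat_history.append({"role": "assistant", "content": reply})

follow_up = model.chat(
    "它有多重?",  # "How heavy is it?"
    chat_history,
    generate_config={"max_tokens": 1024},
)
```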
### More information

[Xinference](https://github.com/xorbitsai/inference) lets you replace OpenAI GPT with another LLM in your app
by changing a single line of code. Xinference gives you the freedom to use any LLM you need.
With Xinference, you can run inference with any open-source language model,
speech recognition model, or multimodal model, whether in the cloud, on-premises, or even on your laptop.

<i><a href="https://join.slack.com/t/xorbitsio/shared_invite/zt-1z3zsm9ep-87yI9YZ_B79HLB2ccTq4WA">👉 Join our Slack community!</a></i>