CharlieFRuan committed on
Commit d0ae86d (1 parent: f35b1f2)

Upload README.md with huggingface_hub

  1. README.md +57 -0
README.md ADDED
@@ -0,0 +1,57 @@
+ ---
+ library_name: mlc-llm
+ base_model: Qwen/Qwen2.5-7B-Instruct
+ tags:
+ - mlc-llm
+ - web-llm
+ ---
+
+ # Qwen2.5-7B-Instruct-q0f16-MLC
+
+ This is the [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) model in MLC format `q0f16` (float16 weights, no quantization).
+ The model can be used with the [MLC-LLM](https://github.com/mlc-ai/mlc-llm) and [WebLLM](https://github.com/mlc-ai/web-llm) projects.
+
+ ## Example Usage
+
+ The examples below show how to use this model with MLC LLM.
+ Before running them, install MLC LLM by following the [installation documentation](https://llm.mlc.ai/docs/install/mlc_llm.html#install-mlc-packages).
+
+ ### Chat
+
+ From the command line, run:
+ ```bash
+ mlc_llm chat HF://mlc-ai/Qwen2.5-7B-Instruct-q0f16-MLC
+ ```
+
+ ### REST Server
+
+ From the command line, run:
+ ```bash
+ mlc_llm serve HF://mlc-ai/Qwen2.5-7B-Instruct-q0f16-MLC
+ ```
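Once the server is running, it can be queried over HTTP. The sketch below uses only the Python standard library and rests on two assumptions that this README does not state: that the server listens on `http://127.0.0.1:8000`, and that it exposes an OpenAI-compatible `/v1/chat/completions` route. The helper names `build_chat_request` and `ask` are hypothetical and not part of MLC LLM; adjust the host and port to match your setup.

```python
import json
import urllib.request

# Assumed defaults, not stated in this README: adjust to your server.
BASE_URL = "http://127.0.0.1:8000"
MODEL = "HF://mlc-ai/Qwen2.5-7B-Instruct-q0f16-MLC"


def build_chat_request(prompt, model=MODEL, base_url=BASE_URL):
    """Build (but do not send) a non-streaming chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt):
    """Send the request and return the assistant's reply text.

    Assumes an OpenAI-style response body with choices[0].message.content.
    """
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With `mlc_llm serve` running in another terminal, `print(ask("What is the meaning of life?"))` should print the model's reply. Building the request separately from sending it keeps the payload easy to inspect without a live server.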
+
+ ### Python API
+
+ ```python
+ from mlc_llm import MLCEngine
+
+ # Create engine
+ model = "HF://mlc-ai/Qwen2.5-7B-Instruct-q0f16-MLC"
+ engine = MLCEngine(model)
+
+ # Run a streaming chat completion through the OpenAI-style API.
+ for response in engine.chat.completions.create(
+     messages=[{"role": "user", "content": "What is the meaning of life?"}],
+     model=model,
+     stream=True,
+ ):
+     for choice in response.choices:
+         print(choice.delta.content, end="", flush=True)
+ print("\n")
+
+ # Shut down the engine and release its resources.
+ engine.terminate()
+ ```
54
+
55
+ ## Documentation
56
+
57
+ For more information on MLC LLM project, please visit our [documentation](https://llm.mlc.ai/docs/) and [GitHub repo](http://github.com/mlc-ai/mlc-llm).