mlc-ai/Mistral-7B-Instruct-v0.2-q3f16_1-MLC

#1
Files changed (2)
  1. README.md +0 -57
  2. mlc-chat-config.json +2 -35
README.md DELETED
@@ -1,57 +0,0 @@
- ---
- library_name: mlc-llm
- base_model: mistralai/Mistral-7B-Instruct-v0.2
- tags:
- - mlc-llm
- - web-llm
- ---
-
- # Mistral-7B-Instruct-v0.2-q3f16_1-MLC
-
- This is the [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) model in MLC format `q3f16_1`.
- The model can be used for projects [MLC-LLM](https://github.com/mlc-ai/mlc-llm) and [WebLLM](https://github.com/mlc-ai/web-llm).
-
- ## Example Usage
-
- Here are some examples of using this model in MLC LLM.
- Before running the examples, please install MLC LLM by following the [installation documentation](https://llm.mlc.ai/docs/install/mlc_llm.html#install-mlc-packages).
-
- ### Chat
-
- In command line, run
- ```bash
- mlc_llm chat HF://mlc-ai/Mistral-7B-Instruct-v0.2-q3f16_1-MLC
- ```
-
- ### REST Server
-
- In command line, run
- ```bash
- mlc_llm serve HF://mlc-ai/Mistral-7B-Instruct-v0.2-q3f16_1-MLC
- ```
-
- ### Python API
-
- ```python
- from mlc_llm import MLCEngine
-
- # Create engine
- model = "HF://mlc-ai/Mistral-7B-Instruct-v0.2-q3f16_1-MLC"
- engine = MLCEngine(model)
-
- # Run chat completion in OpenAI API.
- for response in engine.chat.completions.create(
-     messages=[{"role": "user", "content": "What is the meaning of life?"}],
-     model=model,
-     stream=True,
- ):
-     for choice in response.choices:
-         print(choice.delta.content, end="", flush=True)
-     print("\n")
-
- engine.terminate()
- ```
-
- ## Documentation
-
- For more information on MLC LLM project, please visit our [documentation](https://llm.mlc.ai/docs/) and [GitHub repo](http://github.com/mlc-ai/mlc-llm).
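
For reference, the Python snippet in the deleted README streams tokens. A non-streaming variant of the same OpenAI-style call is sketched below, assuming (as the snippet above does) `mlc_llm`'s `MLCEngine`, and that passing `stream=False` returns one complete response object rather than a stream of deltas:

```python
from mlc_llm import MLCEngine

model = "HF://mlc-ai/Mistral-7B-Instruct-v0.2-q3f16_1-MLC"
engine = MLCEngine(model)

# stream=False is assumed to yield a single OpenAI-style completion object,
# so the full reply is read from choices[0].message.content.
response = engine.chat.completions.create(
    messages=[{"role": "user", "content": "What is the meaning of life?"}],
    model=model,
    stream=False,
)
print(response.choices[0].message.content)

engine.terminate()
```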
mlc-chat-config.json CHANGED
@@ -15,8 +15,7 @@
    "sliding_window_size": 1024,
    "prefill_chunk_size": 128,
    "attention_sink_size": 4,
-   "tensor_parallel_shards": 1,
-   "max_batch_size": 80
+   "tensor_parallel_shards": 1
  },
  "vocab_size": 32000,
  "context_window_size": -1,
@@ -30,39 +29,7 @@
  "temperature": 0.7,
  "repetition_penalty": 1.0,
  "top_p": 0.95,
- "conv_template": {
-   "name": "mistral_default",
-   "system_template": "[INST] {system_message}",
-   "system_message": "Always assist with care, respect, and truth. Respond with utmost utility yet securely. Avoid harmful, unethical, prejudiced, or negative content. Ensure replies promote fairness and positivity.",
-   "system_prefix_token_ids": [
-     1
-   ],
-   "add_role_after_system_message": false,
-   "roles": {
-     "user": "[INST]",
-     "assistant": "[/INST]",
-     "tool": "[INST]"
-   },
-   "role_templates": {
-     "user": "{user_message}",
-     "assistant": "{assistant_message}",
-     "tool": "{tool_message}"
-   },
-   "messages": [],
-   "seps": [
-     " "
-   ],
-   "role_content_sep": " ",
-   "role_empty_sep": "",
-   "stop_str": [
-     "</s>"
-   ],
-   "stop_token_ids": [
-     2
-   ],
-   "function_string": "",
-   "use_function_calling": false
- },
+ "conv_template": "mistral_default",
  "pad_token_id": 0,
  "bos_token_id": 1,
  "eos_token_id": 2,
 
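Net effect of the two hunks above: `conv_template` becomes a bare template name, presumably resolved from MLC LLM's built-in conversation-template registry when the model is loaded, and `max_batch_size` is no longer pinned in the config. Below is a minimal sketch for checking the updated file for this shape; the file path and the `model_config` key for the nested block are assumptions, since the hunks do not show that enclosing key:

```python
import json

# Path is illustrative; mlc-chat-config.json ships at the root of this model repo.
with open("mlc-chat-config.json") as f:
    cfg = json.load(f)

# conv_template is now a template *name*, not an inline object.
assert cfg["conv_template"] == "mistral_default"

# The nested block holding sliding_window_size etc. is assumed to sit under
# "model_config"; the hunks above do not show its enclosing key.
model_cfg = cfg.get("model_config", {})
assert model_cfg.get("tensor_parallel_shards") == 1
assert "max_batch_size" not in model_cfg  # no longer pinned after this change
```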