Commit `03c8f11` (parent: `a6ca77e`) by Shu: Update README.md. Revised the README and added llama.cpp instructions.
**Acknowledgement**:

We sincerely thank our community members, [Mingyuan](https://huggingface.co/ThunderBeee) and [Zoey](https://huggingface.co/ZY6), for their extraordinary contributions to this quantization effort. Please explore [Octopus-v4](https://huggingface.co/NexaAIDev/Octopus-v4) for our original Hugging Face model.
## (Recommended) Run with [llama.cpp](https://github.com/ggerganov/llama.cpp)

1. **Clone and compile:**

   ```bash
   git clone https://github.com/ggerganov/llama.cpp
   cd llama.cpp

   # Compile the source code:
   make
   ```
2. **Prepare the Input Prompt File:**

   Navigate to the `prompts` folder inside `llama.cpp` and create a new file named `chat-with-octopus.txt`.

   `chat-with-octopus.txt`:

   ```bash
   # Write "User:" at the top of the file to set the identifier for input.
   User:
   ```
3. **Execute the Model:**

   Run the following command in the terminal:

   ```bash
   ./main -m ./Octopus-v4-gguf/Octopus-v4-Q2_K.gguf -c 512 -b 2048 -n 256 -t 1 --repeat_penalty 1.0 --top_k 0 --top_p 1.0 --color -i -r "User:" -f prompts/chat-with-octopus.txt
   ```
   Example prompt to interact with the model:

   ```bash
   <|system|>You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function.<|end|><|user|>Tell me the result of derivative of x^3 when x is 2?<|end|><|assistant|>
   ```
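The prompt above follows a fixed template: a `<|system|>` block, the user query, and a trailing `<|assistant|>` tag, each turn closed by `<|end|>`. As a small sketch (the variable names here are our own, not part of the model), the template can be assembled in the shell before piping it to the model:

```bash
# Assemble an Octopus-v4 router prompt from a system message and a user query.
# The <|system|>/<|user|>/<|assistant|> tags and <|end|> separators follow the
# example prompt shown above.
SYSTEM="You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function."
QUERY="Tell me the result of derivative of x^3 when x is 2?"
PROMPT="<|system|>${SYSTEM}<|end|><|user|>${QUERY}<|end|><|assistant|>"

# Print the assembled prompt on a single line.
printf '%s\n' "$PROMPT"
```

Keeping the prompt on one line matters: the template has no newlines between the tags, matching the example above.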
## Run with [Ollama](https://github.com/ollama/ollama)

1. Create a `Modelfile` in your directory and include a `FROM` statement with the path to your local model:

   ```bash
   FROM ./path/to/octopus-v4-Q4_K_M
   ```
2. Use the following command to add the model to Ollama:

   ```bash
   ollama create octopus-v4-Q4_K_M -f Modelfile
   ```
3. Verify that the model has been successfully imported:

   ```bash
   ollama ls
   ```
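The minimal `Modelfile` in step 1 only points at the weights, so the full router prompt has to be passed on every `ollama run` call. Ollama's Modelfile format also supports `TEMPLATE`, `SYSTEM`, and `PARAMETER` directives, which can bake the Octopus chat tags in once. The sketch below is an untested assumption on our part, not part of this repository:

```
# Hypothetical Modelfile: bakes the Octopus-v4 router prompt into the model.
FROM ./path/to/octopus-v4-Q4_K_M

# Map Ollama's prompt variables onto the Octopus chat tags.
TEMPLATE "<|system|>{{ .System }}<|end|><|user|>{{ .Prompt }}<|end|><|assistant|>"

# Default system message, so `ollama run` only needs the user query.
SYSTEM "You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function."

# Stop generation at the end-of-turn tag.
PARAMETER stop "<|end|>"
```

With a template like this, `ollama run octopus-v4-Q4_K_M "Tell me the result of derivative of x^3 when x is 2?"` would be enough; otherwise use the full tagged prompt shown below.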
### Run the model

```bash
ollama run octopus-v4-Q4_K_M "<|system|>You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function.<|end|><|user|>Tell me the result of derivative of x^3 when x is 2?<|end|><|assistant|>"
```
### Dataset and Benchmark