Commit `03c8f11` (parent: `a6ca77e`) by Shu: Update README.md. Revised the README and added llama.cpp instructions.
**Acknowledgement**:

We sincerely thank our community members, [Mingyuan](https://huggingface.co/ThunderBeee) and [Zoey](https://huggingface.co/ZY6), for their extraordinary contributions to this quantization effort. Please explore [Octopus-v4](https://huggingface.co/NexaAIDev/Octopus-v4) for our original Hugging Face model.
## (Recommended) Run with [llama.cpp](https://github.com/ggerganov/llama.cpp)

1. **Clone and compile:**

   ```bash
   git clone https://github.com/ggerganov/llama.cpp
   cd llama.cpp

   # Compile the source code:
   make
   ```
2. **Prepare the Input Prompt File:**

   Navigate to the `prompts` folder inside `llama.cpp` and create a new file named `chat-with-octopus.txt`.

   `chat-with-octopus.txt`:

   ```bash
   # Write "User:" at the top of the file to set the identifier for input.
   User:
   ```
3. **Execute the Model:**

   Run the following command in the terminal:

   ```bash
   ./main -m ./Octopus-v4-gguf/Octopus-v4-Q2_K.gguf -c 512 -b 2048 -n 256 -t 1 --repeat_penalty 1.0 --top_k 0 --top_p 1.0 --color -i -r "User:" -f prompts/chat-with-octopus.txt
   ```
   Example prompt to interact with the model:

   ```bash
   <|system|>You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function.<|end|><|user|>Tell me the result of derivative of x^3 when x is 2?<|end|><|assistant|>
   ```
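The prompt above follows a fixed template: a `<|system|>` block, the user query, and a trailing `<|assistant|>` tag, each turn closed by `<|end|>`. As a small sketch (the variable names here are our own, not part of the model), the template can be assembled in the shell before piping it to the model:

```bash
# Assemble an Octopus-v4 router prompt from a system message and a user query.
# The <|system|>/<|user|>/<|assistant|> tags and <|end|> separators follow the
# example prompt shown above.
SYSTEM="You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function."
QUERY="Tell me the result of derivative of x^3 when x is 2?"
PROMPT="<|system|>${SYSTEM}<|end|><|user|>${QUERY}<|end|><|assistant|>"

# Print the assembled prompt on a single line.
printf '%s\n' "$PROMPT"
```

Keeping the prompt on one line matters: the template has no newlines between the tags, matching the example above.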
## Run with [Ollama](https://github.com/ollama/ollama)

1. Create a `Modelfile` in your directory and include a `FROM` statement with the path to your local model:

   ```bash
   FROM ./path/to/octopus-v4-Q4_K_M
   ```
2. Use the following command to add the model to Ollama:

   ```bash
   ollama create octopus-v4-Q4_K_M -f Modelfile
   ```
3. Verify that the model has been successfully imported:

   ```bash
   ollama ls
   ```
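The minimal `Modelfile` in step 1 only points at the weights, so the full router prompt has to be passed on every `ollama run` call. Ollama's Modelfile format also supports `TEMPLATE`, `SYSTEM`, and `PARAMETER` directives, which can bake the Octopus chat tags in once. The sketch below is an untested assumption on our part, not part of this repository:

```
# Hypothetical Modelfile: bakes the Octopus-v4 router prompt into the model.
FROM ./path/to/octopus-v4-Q4_K_M

# Map Ollama's prompt variables onto the Octopus chat tags.
TEMPLATE "<|system|>{{ .System }}<|end|><|user|>{{ .Prompt }}<|end|><|assistant|>"

# Default system message, so `ollama run` only needs the user query.
SYSTEM "You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function."

# Stop generation at the end-of-turn tag.
PARAMETER stop "<|end|>"
```

With a template like this, `ollama run octopus-v4-Q4_K_M "Tell me the result of derivative of x^3 when x is 2?"` would be enough; otherwise use the full tagged prompt shown below.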
### Run the model

```bash
ollama run octopus-v4-Q4_K_M "<|system|>You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function.<|end|><|user|>Tell me the result of derivative of x^3 when x is 2?<|end|><|assistant|>"
```
### Dataset and Benchmark