internlm/internlm2_5-7b-chat-gguf


unsubscribe committed on Jul 16, 2024

Commit 8fa9dff (verified) · 1 Parent(s): 2798175

add function call example

Files changed (1)

README.md +45 -0

@@ -55,6 +55,8 @@ huggingface-cli download internlm/internlm2_5-7b-chat-gguf internlm2_5-7b-chat-f
 You can use `llama-cli` for conducting inference. For a detailed explanation of `llama-cli`, please refer to [this guide](https://github.com/ggerganov/llama.cpp/blob/master/examples/main/README.md)
+### chat example
 ```shell
 build/bin/llama-cli \
     --model internlm2_5-7b-chat-fp16.gguf  \
@@ -76,6 +78,49 @@ build/bin/llama-cli \
     --in-suffix "<|im_end|>\n<|im_start|>assistant\n"
 ```
+### Function call example
+`llama-cli` example:
+```shell
+build/bin/llama-cli \
+    --model internlm2_5-7b-chat-fp16.gguf \
+    --predict 512 \
+    --ctx-size 4096 \
+    --gpu-layers 32 \
+    --temp 0.8 \
+    --top-p 0.8 \
+    --top-k 50 \
+    --seed 1024 \
+    --color \
+    --prompt '<|im_start|>system\nYou are InternLM2-Chat, a harmless AI assistant.<|im_end|>\n<|im_start|>system name=<|plugin|>[{"name": "get_current_weather", "parameters": {"required": ["location"], "type": "object", "properties": {"location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}, "unit": {"type": "string"}}}, "description": "Get the current weather in a given location"}]<|im_end|>\n<|im_start|>user\n' \
+    --interactive \
+    --multiline-input \
+    --conversation \
+    --verbose \
+    --in-suffix "<|im_end|>\n<|im_start|>assistant\n" \
+    --special
+```
+Conversation results:
+```text
+<s><|im_start|>system
+You are InternLM2-Chat, a harmless AI assistant.<|im_end|>
+<|im_start|>system name=<|plugin|>[{"name": "get_current_weather", "parameters": {"required": ["location"], "type": "object", "properties": {"location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}, "unit": {"type": "string"}}}, "description": "Get the current weather in a given location"}]<|im_end|>
+<|im_start|>user
+> I want to know today's weather in Shanghai
+I need to use the get_current_weather function to get the current weather in Shanghai.<|action_start|><|plugin|>
+{"name": "get_current_weather", "parameters": {"location": "Shanghai"}}<|action_end|>
+<|im_end|>
+> <|im_start|>environment name=<|plugin|>\n{"temperature": 22}
+The current temperature in Shanghai is 22 degrees Celsius.<|im_end|>
+>
+```
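
Reviewer note: in the transcript above, the user plays the role of the tool runtime by hand. The model emits a JSON call between `<|action_start|><|plugin|>` and `<|action_end|>`, and the user types the tool result back as an `environment` turn. A minimal Python sketch of automating those two steps is below. The function names `parse_plugin_call` and `format_environment_turn` are hypothetical helpers, not part of `llama.cpp` or InternLM, and the marker strings are taken only from the transcript in this diff.

```python
import json
import re

# Matches the JSON payload the model wrote between the action markers,
# as seen in the "Conversation results" transcript.
ACTION_RE = re.compile(
    r"<\|action_start\|><\|plugin\|>\s*(.*?)\s*<\|action_end\|>",
    re.DOTALL,
)


def parse_plugin_call(reply):
    """Return (function_name, parameters) from a model reply, or None if no call is present."""
    match = ACTION_RE.search(reply)
    if match is None:
        return None
    call = json.loads(match.group(1))
    return call["name"], call.get("parameters", {})


def format_environment_turn(result):
    """Build the line to send back as the tool result.

    Assumption: the separator is written as a literal backslash-n, exactly
    as it was typed at the interactive prompt in the transcript.
    """
    return "<|im_start|>environment name=<|plugin|>\\n" + json.dumps(result)


if __name__ == "__main__":
    reply = (
        "I need to use the get_current_weather function to get the current "
        "weather in Shanghai.<|action_start|><|plugin|>\n"
        '{"name": "get_current_weather", "parameters": {"location": "Shanghai"}}'
        "<|action_end|>\n<|im_end|>"
    )
    print(parse_plugin_call(reply))
    print(format_environment_turn({"temperature": 22}))
```

The sketch only handles a single call per reply and does not validate the parameters against the schema declared in the system prompt; a real integration would likely want both.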
 ## Serving
 `llama.cpp` provides an OpenAI API compatible server - `llama-server`. You can deploy `internlm2_5-7b-chat-fp16.gguf` into a service like this: