tinybiggames/Phi-3-mini-4k-instruct-Q4_K_M-GGUF

@@ -3,15 +3,15 @@ language:
 - en
 license: mit
 tags:
 - llama-cpp
 - gguf-my-repo
-- Infero
-- Dllama
 license_link: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/resolve/main/LICENSE
 pipeline_tag: text-generation
 inference:
   parameters:
-    temperature: 0.7
 widget:
 - messages:
   - role: user
@@ -21,59 +21,29 @@ widget:
 # tinybiggames/Phi-3-mini-4k-instruct-Q4_K_M-GGUF
 This model was converted to GGUF format from [`microsoft/Phi-3-mini-4k-instruct`](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
 Refer to the [original model card](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) for more details on the model.
-## Use with tinyBigGAMES's [LMEngine Inference Library](https://github.com/tinyBigGAMES/LMEngine)
-How to configure LMEngine:
-```Delphi
-Config_Init(
- 'C:/LLM/gguf', // path to model files
- -1             // number of GPU layers; -1 uses all available layers
-);
 ```
-How to define model:
-```Delphi
-Model_Define('phi-3-mini-4k-instruct.Q4_K_M.gguf',
-  'phi3:4K:Q4KM', 4000,
-  '<|{role}|>{content}<|end|>',
-  '<|assistant|>');
 ```
-How to add a message:
-```Delphi
-Message_Add(
-  ROLE_USER,    // role
- 'What is AI?'  // content
-);
 ```
-`{role}` - will be substituted with the message "role"
-`{content}` - will be substituted with the message "content"
-How to do inference:
-```Delphi
-var
-  LTokenOutputSpeed: Single;
-  LInputTokens: Int32;
-  LOutputTokens: Int32;
-  LTotalTokens: Int32;
-if Inference_Run('phi3:4K:Q4KM', 1024) then
-  begin
-    Inference_GetUsage(nil, @LTokenOutputSpeed, @LInputTokens, @LOutputTokens,
-      @LTotalTokens);
-    Console_PrintLn('', FG_WHITE);
-    Console_PrintLn('Tokens :: Input: %d, Output: %d, Total: %d, Speed: %3.1f t/s',
-      FG_BRIGHTYELLOW, LInputTokens, LOutputTokens, LTotalTokens, LTokenOutputSpeed);
-  end
-else
-  begin
-    Console_PrintLn('', FG_WHITE);
-    Console_PrintLn('Error: %s', FG_RED, Error_Get());
-  end;
-```

 - en
 license: mit
 tags:
+- nlp
+- code
 - llama-cpp
 - gguf-my-repo
 license_link: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/resolve/main/LICENSE
 pipeline_tag: text-generation
 inference:
   parameters:
+    temperature: 0.0
 widget:
 - messages:
   - role: user
 # tinybiggames/Phi-3-mini-4k-instruct-Q4_K_M-GGUF
 This model was converted to GGUF format from [`microsoft/Phi-3-mini-4k-instruct`](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
 Refer to the [original model card](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) for more details on the model.
+## Use with llama.cpp
+Install llama.cpp through brew.
+```bash
+brew install ggerganov/ggerganov/llama.cpp
 ```
+Invoke the llama.cpp server or the CLI.
+CLI:
+```bash
+llama-cli --hf-repo tinybiggames/Phi-3-mini-4k-instruct-Q4_K_M-GGUF --model phi-3-mini-4k-instruct.Q4_K_M.gguf -p "The meaning to life and the universe is"
 ```
+Server:
+```bash
+llama-server --hf-repo tinybiggames/Phi-3-mini-4k-instruct-Q4_K_M-GGUF --model phi-3-mini-4k-instruct.Q4_K_M.gguf -c 2048
 ```
+Note: You can also use this checkpoint directly by following the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.
+```
+git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp && make && ./main -m phi-3-mini-4k-instruct.Q4_K_M.gguf -n 128
+```
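
Once the `llama-server` command from the Server step is running, it can be queried over HTTP. The following is a minimal sketch, not part of the model card itself. It assumes the server listens on its default address `127.0.0.1:8080` and exposes the `/completion` endpoint with `prompt` and `n_predict` fields. The helper names `json_escape`, `build_payload`, and `complete` and the `LLAMA_SERVER_URL` variable are hypothetical and exist only for this illustration.

```bash
#!/usr/bin/env bash
# Sketch: query a locally running llama-server over HTTP.
# Assumptions (not from the model card): default address 127.0.0.1:8080,
# the /completion endpoint, and the "prompt" / "n_predict" request fields.

LLAMA_SERVER_URL="${LLAMA_SERVER_URL:-http://127.0.0.1:8080}"

# Escape backslashes and double quotes so the prompt fits in a JSON string.
# This is deliberately minimal: newlines and other control characters
# are not handled.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}
  s=${s//\"/\\\"}
  printf '%s' "$s"
}

# Build the JSON request body from a prompt and a token limit.
build_payload() {
  printf '{"prompt": "%s", "n_predict": %d}' "$(json_escape "$1")" "$2"
}

# Send the request. Requires curl and a running llama-server.
# The second argument (token limit) defaults to 128.
complete() {
  curl --silent --request POST \
    --url "$LLAMA_SERVER_URL/completion" \
    --header "Content-Type: application/json" \
    --data "$(build_payload "$1" "${2:-128}")"
}
```

After sourcing the script, `complete "The meaning to life and the universe is" 64` should return a JSON response, with the generated text expected in its `content` field. Nothing is sent until `complete` is called, so the payload can be inspected first with `build_payload`.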