---
language:
- en
license: mit
tags:
- llama-cpp
- gguf-my-repo
- Infero
- Dllama
license_link: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/resolve/main/LICENSE
pipeline_tag: text-generation
inference:
  parameters:
    temperature: 0.7
widget:
- messages:
  - role: user
    content: Can you provide ways to eat combinations of bananas and dragonfruits?
---

# tinybiggames/Phi-3-mini-4k-instruct-Q4_K_M-GGUF

This model was converted to GGUF format from [`microsoft/Phi-3-mini-4k-instruct`](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.

Refer to the [original model card](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) for more details on the model.

## Use with tinyBigGAMES's [LMEngine Inference Library](https://github.com/tinyBigGAMES/LMEngine)

How to configure LMEngine:

```Delphi
Config_Init(
  'C:/LLM/gguf', // path to model files
  -1             // number of GPU layers, -1 to use all available layers
);
```
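
How to define a model:

```Delphi
Model_Define(
  'phi-3-mini-4k-instruct.Q4_K_M.gguf', // model file name
  'phi3:4K:Q4KM',                       // reference name, used later by Inference_Run
  4000,                                 // context size
  '<|{role}|>{content}<|end|>',         // message template
  '<|assistant|>'                       // marker appended after the messages
);
```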

How to add a message:

```Delphi
Message_Add(
  ROLE_USER,    // role
  'What is AI?' // content
);
```

In the template passed to `Model_Define`, `{role}` is replaced with the message's role and `{content}` is replaced with the message's content.
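
To make the placeholder substitution concrete, here is a minimal sketch that fills the same template by hand with `StringReplace` from `SysUtils`. `RenderMessage` is a hypothetical helper written for this illustration and is not part of LMEngine. The sketch also assumes that the user role renders as the literal text `user` and that the end marker is simply appended. The prompt LMEngine actually builds may differ.

```Delphi
program TemplateSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: fills the {role} and {content} placeholders of a
// message template. It only illustrates the substitution and is not
// LMEngine's own code.
function RenderMessage(const ATemplate, ARole, AContent: string): string;
begin
  Result := StringReplace(ATemplate, '{role}', ARole, [rfReplaceAll]);
  Result := StringReplace(Result, '{content}', AContent, [rfReplaceAll]);
end;

const
  // Same template and end marker that are passed to Model_Define.
  CTemplate    = '<|{role}|>{content}<|end|>';
  CTemplateEnd = '<|assistant|>';

begin
  // Assumption: the user role is rendered as the text 'user'.
  WriteLn(RenderMessage(CTemplate, 'user', 'What is AI?') + CTemplateEnd);
end.
```

Under those assumptions the program prints `<|user|>What is AI?<|end|><|assistant|>`, which shows how one message and the trailing assistant marker would combine into a single prompt string.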

How to do inference:

```Delphi
var
  LTokenOutputSpeed: Single;
  LInputTokens: Int32;
  LOutputTokens: Int32;
  LTotalTokens: Int32;

// Run inference on the model registered under the reference name 'phi3:4K:Q4KM'
if Inference_Run('phi3:4K:Q4KM', 1024) then
  begin
    // Fetch usage statistics; nil is passed for a value that is not needed here
    Inference_GetUsage(nil, @LTokenOutputSpeed, @LInputTokens, @LOutputTokens,
      @LTotalTokens);
    Console_PrintLn('', FG_WHITE);
    Console_PrintLn('Tokens :: Input: %d, Output: %d, Total: %d, Speed: %3.1f t/s',
      FG_BRIGHTYELLOW, LInputTokens, LOutputTokens, LTotalTokens, LTokenOutputSpeed);
  end
else
  begin
    // Report the error message on failure
    Console_PrintLn('', FG_WHITE);
    Console_PrintLn('Error: %s', FG_RED, Error_Get());
  end;
```