---
language:
- en
license: mit
tags:
- nlp
- code
- llama-cpp
- gguf-my-repo
- LMEngine
license_link: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/resolve/main/LICENSE
pipeline_tag: text-generation
inference:
  parameters:
    temperature: 0
widget:
- messages:
  - role: user
    content: Can you provide ways to eat combinations of bananas and dragonfruits?
---
# tinybiggames/Phi-3-mini-4k-instruct-Q4_K_M-GGUF
This model was converted to GGUF format from [`microsoft/Phi-3-mini-4k-instruct`](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) for more details on the model.
## Use with tinyBigGAMES' [LMEngine](https://github.com/tinyBigGAMES) inference library
How to configure LMEngine:
```Delphi
InitConfig(
  'C:/LLM/gguf', // path to the folder containing the model files
  -1             // number of GPU layers; -1 offloads all available layers
);
```
How to define model:
```Delphi
DefineModel(
  'phi-3-mini-4k-instruct.Q4_K_M.gguf', // GGUF filename on disk
  'phi-3-mini-4k-instruct.Q4_K_M',      // reference name used by RunInference
  4000,                                 // maximum context size in tokens
  '<|{role}|>{content}<|end|>',         // per-message prompt template
  '<|assistant|>');                     // template that cues the assistant's reply
```
How to add a message:
```Delphi
AddMessage(
  ROLE_USER,    // role
  'What is AI?' // content
);
```
- `{role}` - substituted with the message "role"
- `{content}` - substituted with the message "content"
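For example, given the template defined above, the user message added in the previous step would render to prompt text along these lines (illustrative; the exact whitespace the engine emits may differ):
```
<|user|>What is AI?<|end|>
<|assistant|>
```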
How to run inference:
```Delphi
var
  LTokenOutputSpeed: Single;
  LInputTokens: Int32;
  LOutputTokens: Int32;
  LTotalTokens: Int32;

if RunInference('phi-3-mini-4k-instruct.Q4_K_M', 1024) then
begin
  // Query runtime statistics for the completed inference
  GetInferenceStats(nil, @LTokenOutputSpeed, @LInputTokens, @LOutputTokens,
    @LTotalTokens);
  PrintLn('', FG_WHITE);
  PrintLn('Tokens :: Input: %d, Output: %d, Total: %d, Speed: %3.1f t/s',
    FG_BRIGHTYELLOW, LInputTokens, LOutputTokens, LTotalTokens, LTokenOutputSpeed);
end
else
begin
  PrintLn('', FG_WHITE);
  PrintLn('Error: %s', FG_RED, GetError());
end;
```
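Putting the steps together, here is a minimal end-to-end console sketch. It uses only the calls shown above; the program scaffolding and the `LMEngine` unit name are assumptions, so adjust them to match the library's actual units and any required initialization or teardown calls:
```Delphi
program Phi3Demo;

{$APPTYPE CONSOLE}

uses
  LMEngine; // assumed unit name; replace with the library's actual unit

var
  LTokenOutputSpeed: Single;
  LInputTokens, LOutputTokens, LTotalTokens: Int32;
begin
  // 1) Point the engine at the folder holding the GGUF file; -1 = all GPU layers
  InitConfig('C:/LLM/gguf', -1);

  // 2) Register the model under a reference name with its prompt template
  DefineModel('phi-3-mini-4k-instruct.Q4_K_M.gguf',
    'phi-3-mini-4k-instruct.Q4_K_M', 4000,
    '<|{role}|>{content}<|end|>',
    '<|assistant|>');

  // 3) Queue a user message
  AddMessage(ROLE_USER, 'What is AI?');

  // 4) Run inference (up to 1024 output tokens) and report statistics
  if RunInference('phi-3-mini-4k-instruct.Q4_K_M', 1024) then
  begin
    GetInferenceStats(nil, @LTokenOutputSpeed, @LInputTokens, @LOutputTokens,
      @LTotalTokens);
    PrintLn('Tokens :: Input: %d, Output: %d, Total: %d, Speed: %3.1f t/s',
      FG_BRIGHTYELLOW, LInputTokens, LOutputTokens, LTotalTokens,
      LTokenOutputSpeed);
  end
  else
    PrintLn('Error: %s', FG_RED, GetError());
end.
```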