---
language:
- en
license: mit
tags:
- nlp
- code
- llama-cpp
- gguf-my-repo
- LMEngine
license_link: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/resolve/main/LICENSE
pipeline_tag: text-generation
inference:
  parameters:
    temperature: 0
widget:
- messages:
  - role: user
    content: Can you provide ways to eat combinations of bananas and dragonfruits?
---
# tinybiggames/Phi-3-mini-4k-instruct-Q4_K_M-GGUF
This model was converted to GGUF format from [`microsoft/Phi-3-mini-4k-instruct`](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) for more details on the model.
## Use with tinyBigGAMES' [LMEngine](https://github.com/tinyBigGAMES) inference library
How to configure LMEngine:
```Delphi
InitConfig(
  'C:/LLM/gguf', // path to the folder containing the model files
  -1             // number of GPU layers; -1 offloads all available layers
);
```
How to define model:
```Delphi
DefineModel(
  'phi-3-mini-4k-instruct.Q4_K_M.gguf', // GGUF filename on disk
  'phi-3-mini-4k-instruct.Q4_K_M',      // reference name used by RunInference
  4000,                                 // maximum context size in tokens
  '<|{role}|>{content}<|end|>',         // per-message prompt template
  '<|assistant|>');                     // template that cues the assistant's reply
```
How to add a message:
```Delphi
AddMessage(
  ROLE_USER,    // role
  'What is AI?' // content
);
```
- `{role}` - substituted with the message "role"
- `{content}` - substituted with the message "content"
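For example, given the template defined above, the user message added in the previous step would render to prompt text along these lines (illustrative; the exact whitespace the engine emits may differ):
```
<|user|>What is AI?<|end|>
<|assistant|>
```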
How to run inference:
```Delphi
var
  LTokenOutputSpeed: Single;
  LInputTokens: Int32;
  LOutputTokens: Int32;
  LTotalTokens: Int32;

if RunInference('phi-3-mini-4k-instruct.Q4_K_M', 1024) then
begin
  // Query runtime statistics for the completed inference
  GetInferenceStats(nil, @LTokenOutputSpeed, @LInputTokens, @LOutputTokens,
    @LTotalTokens);
  PrintLn('', FG_WHITE);
  PrintLn('Tokens :: Input: %d, Output: %d, Total: %d, Speed: %3.1f t/s',
    FG_BRIGHTYELLOW, LInputTokens, LOutputTokens, LTotalTokens, LTokenOutputSpeed);
end
else
begin
  PrintLn('', FG_WHITE);
  PrintLn('Error: %s', FG_RED, GetError());
end;
```
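Putting the steps together, here is a minimal end-to-end console sketch. It uses only the calls shown above; the program scaffolding and the `LMEngine` unit name are assumptions, so adjust them to match the library's actual units and any required initialization or teardown calls:
```Delphi
program Phi3Demo;

{$APPTYPE CONSOLE}

uses
  LMEngine; // assumed unit name; replace with the library's actual unit

var
  LTokenOutputSpeed: Single;
  LInputTokens, LOutputTokens, LTotalTokens: Int32;
begin
  // 1) Point the engine at the folder holding the GGUF file; -1 = all GPU layers
  InitConfig('C:/LLM/gguf', -1);

  // 2) Register the model under a reference name with its prompt template
  DefineModel('phi-3-mini-4k-instruct.Q4_K_M.gguf',
    'phi-3-mini-4k-instruct.Q4_K_M', 4000,
    '<|{role}|>{content}<|end|>',
    '<|assistant|>');

  // 3) Queue a user message
  AddMessage(ROLE_USER, 'What is AI?');

  // 4) Run inference (up to 1024 output tokens) and report statistics
  if RunInference('phi-3-mini-4k-instruct.Q4_K_M', 1024) then
  begin
    GetInferenceStats(nil, @LTokenOutputSpeed, @LInputTokens, @LOutputTokens,
      @LTotalTokens);
    PrintLn('Tokens :: Input: %d, Output: %d, Total: %d, Speed: %3.1f t/s',
      FG_BRIGHTYELLOW, LInputTokens, LOutputTokens, LTotalTokens,
      LTokenOutputSpeed);
  end
  else
    PrintLn('Error: %s', FG_RED, GetError());
end.
```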