metadata

base_model: microsoft/phi-2
inference: false
language:
  - en
license: other
license_link: https://huggingface.co/microsoft/phi-2/resolve/main/LICENSE
license_name: microsoft-research-license
model_creator: Microsoft
model_name: Phi 2
model_type: phi-msft
pipeline_tag: text-generation
quantized_by: Second State Inc.
tags:
  - nlp
  - code

Phi-2-GGUF

Original Model

microsoft/phi-2

Run with LlamaEdge

LlamaEdge version: v0.2.8 and above
Prompt template
- Prompt type: phi-2-instruct
- Prompt string
```
Instruct: <prompt>\nOutput:
```
Context size: 2048
Run as LlamaEdge command app
```
wasmedge --dir .:. --nn-preload default:GGML:AUTO:phi-2-Q5_K_M.gguf llama-chat.wasm -p phi-2-instruct
```
Note that phi-2 is used here as an instruct model, not as a chat model.
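
To make the prompt string above concrete, here is a minimal shell sketch that fills the template for a single instruction. It assumes the `\n` in the template stands for a real newline character, and `build_prompt` is a hypothetical helper name used only for illustration; it is not part of LlamaEdge, which applies the template itself when `-p phi-2-instruct` is passed.

```shell
#!/bin/sh
# build_prompt INSTRUCTION: print the phi-2-instruct prompt for one instruction.
# Hypothetical helper for illustration; LlamaEdge builds this string internally.
build_prompt() {
  # %s inserts the instruction; \n in the format string becomes a newline.
  printf 'Instruct: %s\nOutput:' "$1"
}

# Example: show the exact text the model would receive.
build_prompt "Write a haiku about WebAssembly."
echo
```

Because the template has no slot for earlier turns, each request stands alone, which is consistent with using phi-2 as an instruct model.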

Quantized GGUF Models

| Name | Quant method | Bits | Size | Use case |
| ---- | ---- | ---- | ---- | ---- |
| phi-2-Q2_K.gguf | Q2_K | 2 | 1.11 GB | smallest, significant quality loss - not recommended for most purposes |
| phi-2-Q3_K_L.gguf | Q3_K_L | 3 | 1.58 GB | small, substantial quality loss |
| phi-2-Q3_K_M.gguf | Q3_K_M | 3 | 1.43 GB | very small, high quality loss |
| phi-2-Q3_K_S.gguf | Q3_K_S | 3 | 1.25 GB | very small, high quality loss |
| phi-2-Q4_0.gguf | Q4_0 | 4 | 1.60 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
| phi-2-Q4_K_M.gguf | Q4_K_M | 4 | 1.74 GB | medium, balanced quality - recommended |
| phi-2-Q4_K_S.gguf | Q4_K_S | 4 | 1.63 GB | small, greater quality loss |
| phi-2-Q5_0.gguf | Q5_0 | 5 | 1.93 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
| phi-2-Q5_K_M.gguf | Q5_K_M | 5 | 2.00 GB | large, very low quality loss - recommended |
| phi-2-Q5_K_S.gguf | Q5_K_S | 5 | 1.93 GB | large, low quality loss - recommended |
| phi-2-Q6_K.gguf | Q6_K | 6 | 2.29 GB | very large, extremely low quality loss |
| phi-2-Q8_0.gguf | Q8_0 | 8 | 2.96 GB | very large, extremely low quality loss - not recommended |
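
As a rough sketch of how one of the files listed above might be fetched before running the command app, the snippet below builds a download URL using the same `resolve/main` pattern as the license link in the metadata. The repository id `second-state/phi-2-GGUF` is an assumption based on the model name and quantizer, and `gguf_url` is a hypothetical helper; adjust both if the files are hosted elsewhere.

```shell
#!/bin/sh
# gguf_url REPO FILE: print the download URL for FILE in a Hugging Face repo.
# Hypothetical helper for illustration only.
gguf_url() {
  printf 'https://huggingface.co/%s/resolve/main/%s\n' "$1" "$2"
}

# Assumption: the quantized files live under this repository id.
REPO="second-state/phi-2-GGUF"
# Q5_K_M is the variant used in the run command in this document.
FILE="phi-2-Q5_K_M.gguf"

echo "Download URL: $(gguf_url "$REPO" "$FILE")"

# To actually fetch the file (about 2.00 GB per the table), uncomment:
# curl -LO "$(gguf_url "$REPO" "$FILE")"
```

The download is left commented out so the sketch can be run without transferring any data. Swapping `FILE` for another name in the table selects a different size and quality trade-off.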