---
license: mit
pipeline_tag: text-generation
library_name: transformers.js
tags:
- ONNX
- DML
- ONNXRuntime
- nlp
- conversational
---

# Phi-3 Mini-4K-Instruct ONNX model for onnxruntime-web
This is the same model as the [official Phi-3 ONNX model](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx), with a few changes to make it work with onnxruntime-web:

1. The model is fp16, with int4 block quantization for the weights.
2. The `logits` output is fp32.
3. The model uses MHA (multi-head attention) instead of GQA (grouped-query attention).
4. The ONNX file and the external data file need to stay below 2GB to be cacheable in Chromium.
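
The size constraint in the last item can be checked before publishing or loading the files. The following is a minimal sketch, not part of this repository: it assumes "2GB" means 2 GiB (2^31 bytes) and that the limit applies to each file separately, and the file names and sizes are hypothetical placeholders.

```typescript
// Assumption: "2GB" is read as 2 GiB (2 ** 31 bytes), applied per file.
// This README does not state the exact limit, so adjust as needed.
const ASSUMED_LIMIT_BYTES = 2 ** 31;

interface ModelFile {
  name: string; // hypothetical file name, for illustration only
  sizeBytes: number; // file size in bytes
}

// Return every file whose size is at or above the limit,
// i.e. every file that does not stay below it.
function filesTooLargeToCache(
  files: ModelFile[],
  limitBytes: number = ASSUMED_LIMIT_BYTES,
): ModelFile[] {
  return files.filter((f) => f.sizeBytes >= limitBytes);
}

// Illustrative sizes only; these are not the real sizes of this model's files.
const MIB = 1024 * 1024;
const exampleFiles: ModelFile[] = [
  { name: "model.onnx", sizeBytes: 800 * MIB },
  { name: "model.onnx.data", sizeBytes: 1900 * MIB },
  { name: "too_big.onnx.data", sizeBytes: 2200 * MIB },
];

const oversized = filesTooLargeToCache(exampleFiles);
console.log(oversized.map((f) => f.name));
```

If the returned list is non-empty, the weights would need to be split or quantized further before the files could satisfy the constraint described above.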