---
license: mit
pipeline_tag: text-generation
tags:
- ONNX
- DML
- ONNXRuntime
- phi3
- nlp
- conversational
- custom_code
---
# Phi-3 Mini-128K-Instruct ONNX model for onnxruntime-web
This is the same model as the [official phi3 onnx model](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct-onnx), with a few changes to make it work with onnxruntime-web:
1. the model is fp16 with int4 block quantization for the weights
2. the 'logits' output is fp32
3. the model uses MHA (multi-head attention) instead of GQA (grouped-query attention)
4. the onnx file and the external data file each need to stay below 2GB to be cacheable in Chromium
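As a rough sketch of how the model might be loaded in the browser (the model URL, session options, and the WebGPU execution-provider choice are illustrative assumptions, not part of this card):

```javascript
// Sketch: loading this model with onnxruntime-web.
// The session options below are assumptions; adjust the model URL
// to wherever you host the .onnx file and its external data file.
const sessionOptions = {
  // WebGPU is assumed here because the weights are fp16;
  // 'wasm' is a possible fallback on browsers without WebGPU.
  executionProviders: ['webgpu'],
};

async function createSession(modelUrl) {
  // dynamic import keeps this sketch dependency-free until called
  const ort = await import('onnxruntime-web');
  // InferenceSession.create fetches the model (and its external data
  // file, which must each stay under 2GB to be cacheable in Chromium)
  return ort.InferenceSession.create(modelUrl, sessionOptions);
}
```

The returned session exposes the model's fp32 `logits` output for sampling the next token.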