schmuell committed on
Commit a71a5e3
1 Parent(s): ed28baf

Update README.md

Files changed (1)
  1. README.md +4 -0
README.md CHANGED
@@ -1,3 +1,7 @@
 ---
 license: mit
 ---
+Converted to ONNX from https://huggingface.co/microsoft/phi-2
+fp16 with weights block quantized to int4 whenever visible.
+The last 3 layers are kept in fp32 to avoid fp16 overflow.
+Some outputs are cast to fp32 to make the model friendly for ort-web.
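The added README lines mention block quantization to int4 and keeping some layers in fp32 to avoid fp16 overflow. The following is a minimal sketch of both ideas in plain Python, not the conversion script used for this model. The block size of 32, the symmetric scheme, and every function name here are assumptions made for illustration; the commit does not state them.

```python
import random
import struct

BLOCK_SIZE = 32  # hypothetical; the commit does not state the block size


def quantize_block(block):
    """Symmetric int4 quantization of one block: returns (scale, codes)."""
    max_abs = max(abs(w) for w in block)
    # Map the largest magnitude in the block to the int4 value 7.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in block]
    return scale, codes


def quantize(weights, block_size=BLOCK_SIZE):
    """Split a flat weight list into blocks, each with its own scale."""
    return [quantize_block(weights[i:i + block_size])
            for i in range(0, len(weights), block_size)]


def dequantize(blocks):
    """Rebuild approximate weights from (scale, codes) pairs."""
    restored = []
    for scale, codes in blocks:
        restored.extend(scale * q for q in codes)
    return restored


def fits_in_fp16(value):
    """Return True if value can be packed as an IEEE 754 half float."""
    try:
        struct.pack("<e", value)
        return True
    except OverflowError:
        return False


random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(128)]
blocks = quantize(weights)
restored = dequantize(blocks)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print("blocks:", len(blocks), "max abs error:", max_err)
print("70000.0 fits in fp16:", fits_in_fp16(70000.0))
```

Because each block carries its own scale, a single outlier only degrades the precision of its own block. The `fits_in_fp16` check shows why large activations can be a problem in fp16: the largest finite half-precision value is 65504, so anything well beyond that has to stay in fp32.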