Upload fixed ONNX weights

#3
by Xenova (HF staff), opened

Makes the weights compatible with the Transformers.js format and ensures both files are under 2GB.

Code used to generate the weights:

wget https://huggingface.co/schmuell/phi3-int4/resolve/main/onnx/decoder_model_merged.onnx
wget https://huggingface.co/schmuell/phi3-int4/resolve/main/onnx/decoder_model_merged.onnx.data
import onnx

# Load the original model; the external decoder_model_merged.onnx.data file
# is picked up automatically from the same directory.
onnx_model = onnx.load("decoder_model_merged.onnx")

file_name = 'model_q4.onnx'
onnx.save(onnx_model, file_name,
          save_as_external_data=True,      # keep large tensors out of the .onnx file so it stays < 2GB
          convert_attribute=False,
          location=file_name + '_data',    # single external file: model_q4.onnx_data
          all_tensors_to_one_file=True,
          size_threshold=10_000_000,       # only tensors larger than ~10 MB are externalized
)
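As a quick follow-up, here is a minimal (illustrative) sanity check, assuming the script above was run in the current directory: it verifies that both output files stay under the 2GB protobuf limit and that the re-saved model still validates.

import os
import onnx

# Hypothetical sanity check for the files produced by the script above.
LIMIT = 2 * 1024**3  # 2 GiB protobuf size limit

for path in ["model_q4.onnx", "model_q4.onnx_data"]:
    size = os.path.getsize(path)
    print(f"{path}: {size / 1024**2:.1f} MB")
    assert size < LIMIT, f"{path} exceeds 2GB"

# check_model accepts a path and resolves the external data file next to it.
onnx.checker.check_model("model_q4.onnx")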
Xenova changed pull request status to merged
