ONNX Conversion Tutorial

#3
by qnguyen3 - opened

Hi @Xenova, thank you for your awesome work.
I recently fine-tuned this model to extract information from images using a JSON Schema, and I plan to embed it in a web application. Could you recommend any existing tutorials on converting the model to the ONNX format? That would let me do the conversion myself in the future.

I must admit, the current export process is a bit complicated and still very manual/hacky. I'll eventually turn it into a script, but in the meantime, just ping me and I'd be happy to help out with it.

The export is split into 3 components:

  1. Vision model + multimodal projection (vision_encoder.onnx)
  2. Embedding layer (embed_tokens.onnx)
  3. Language model without embedding layer (decoder_model_merged.onnx)
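
To illustrate how these three pieces might fit together at inference time, here is a minimal NumPy sketch. It is not the actual export or runtime code. It assumes the prompt contains a single placeholder token marking where the image goes, and that the image features from the vision encoder replace that token's row in the output of the embedding layer before the result is passed to the decoder. The placeholder id `IMAGE_TOKEN_ID`, the function `merge_embeddings`, and all shapes are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical placeholder id marking the image position in the prompt (assumption).
IMAGE_TOKEN_ID = -200


def merge_embeddings(input_ids, text_embeds, image_features,
                     image_token_id=IMAGE_TOKEN_ID):
    """Splice image features into the text embeddings at the placeholder.

    input_ids:      (seq_len,) token ids, containing one placeholder
    text_embeds:    (seq_len, hidden) stand-in for the embed_tokens.onnx output
    image_features: (num_patches, hidden) stand-in for the vision_encoder.onnx output
    Returns an array of shape (seq_len - 1 + num_patches, hidden), which is
    what would then be fed to decoder_model_merged.onnx.
    """
    positions = np.flatnonzero(input_ids == image_token_id)
    if positions.size != 1:
        raise ValueError("expected exactly one image placeholder token")
    pos = int(positions[0])
    # The placeholder's own embedding row is dropped and replaced by the image rows.
    return np.concatenate(
        [text_embeds[:pos], image_features, text_embeds[pos + 1:]], axis=0
    )


# Toy data standing in for real model outputs.
input_ids = np.array([1, IMAGE_TOKEN_ID, 2])
text_embeds = np.zeros((3, 4), dtype=np.float32)
image_features = np.ones((5, 4), dtype=np.float32)

inputs_embeds = merge_embeddings(input_ids, text_embeds, image_features)
print(inputs_embeds.shape)
```

Splitting the embedding layer out of the language model is what makes this kind of splice possible: the decoder can then accept ready-made embeddings instead of token ids.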
