How are the ONNX files for this model generated?
👋🏽 Hello!
I'm trying to use nomic-embed-text-v1.5 and was wondering how the ONNX files here were created?
I would like to optimize them for use with TensorRT but I am running into some issues that might be solved by understanding how you export the models.
Thanks for your help.
I used Optimum, and you can see how to do it here: https://github.com/huggingface/optimum/pull/1874 (still a WIP PR)
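For reference, an export like the one in that PR can be reproduced from the command line along these lines. This is a sketch only: the exact flags and output may differ from what the WIP PR finally lands on, and the output directory name is mine.

```shell
# Sketch: export nomic-embed-text-v1.5 to ONNX with Optimum's exporter CLI.
# Assumes `optimum` with the exporters extra is installed; flags may differ
# from the final state of the linked PR.
pip install "optimum[exporters]"
optimum-cli export onnx \
  --model nomic-ai/nomic-embed-text-v1.5 \
  --trust-remote-code \
  nomic-embed-onnx/
```

`--trust-remote-code` is needed because the checkpoint ships custom modeling code.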
@zpn there are no errors as of now, just that a straightforward ONNX-to-TensorRT conversion doesn't produce as performant a model as I expected.
@Xenova I'm actually the author of that PR 😄 I was asking about the conversion because I see the inputs and outputs are different in this repo and in what I get via Optimum:
- the model.onnx you generated has inputs
Just wanted to make sure that the PR is still correct.
Hmm, I've had mixed results with TensorRT in the past. Are you able to post the ONNX/TensorRT graph? I imagine there may be a lot of unoptimized code.
@zpn actually the TensorRT conversion worked out fine in the end. There may still be a lot of unoptimized code, though; possibly something that can be detected with https://github.com/daquexian/onnx-simplifier?
Thanks for the resource, I'll take a look! I'm sure there are a lot of unnecessary expensive ops :)
Just putting it here, another great resource for optimizing ONNX models: https://github.com/tsingmicro-toolchain/OnnxSlim
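Both tools mentioned in this thread expose a CLI; a hedged sketch, assuming the packages install cleanly in your environment and `model.onnx` is the exported file:

```shell
# Sketch: run graph-level simplification over an exported model with the two
# tools linked above. Package and command names assume their pip releases.
pip install onnxsim onnxslim
onnxsim model.onnx model.sim.onnx    # onnx-simplifier: constant folding, op fusion
onnxslim model.onnx model.slim.onnx  # OnnxSlim: similar graph slimming passes
```

The simplified file can then be fed to the TensorRT conversion instead of the raw export.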