[Request] Fix .onnx model and convert it to fp16

#1
by jspsoli - opened

Hi! I'm hosting a program (https://civitai.com/models/166561) that uses your eva02-clip-vit-large-7704 fp16 ONNX model, and I would like to update it to this one.
The problem is that the current .onnx model has a couple of issues.
Its input batch_size is fixed at 4 instead of 1 (which is really not ideal for my use case), and something else seems to be going on as well (the declared input shape can be checked with the snippet below).
GPU usage oscillates between 100% and 10% every second, and inference takes a whole minute for a single image.
So I converted the .pth model myself, successfully: I get the same tags on my 10 test images, and inference takes under a second.
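
For anyone who wants to verify the fixed batch dimension, the declared input shape can be read straight from the graph with the onnx package. A quick sketch (the filename is just a placeholder for the downloaded model):

import onnx

# Load only the model proto; no inference session is needed to inspect I/O shapes.
onnx_model = onnx.load("model.onnx")

for inp in onnx_model.graph.input:
    # A symbolic name (e.g. "batch_size") means the axis is dynamic;
    # a fixed integer (e.g. 4) means the batch size is baked into the graph.
    dims = [d.dim_param or d.dim_value for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)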

This is the code I used for the conversion:


import onnx
from PIL import Image
import torch
from torchvision.transforms import transforms

print("Loading model...")

model = torch.load(r"model.pth", map_location="cpu") # better use CPU for conversions
model.eval()
transform = transforms.Compose([
    transforms.Resize((448, 448)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.48145466, 0.4578275, 0.40821073],
        std=[0.26862954, 0.26130258, 0.27577711],
    ),
])

img = Image.open(r"image.png").convert('RGB') # can be any image I think
tensor = transform(img).unsqueeze(0).to("cpu")

print("Exporting onnx model...")

torch.onnx.export(
    model,                      # model being run
    tensor,                     # model input (or a tuple for multiple inputs)
    "new_model.onnx",           # where to save the model (can be a file or file-like object)
    export_params=True,         # store the trained parameter weights inside the model file
    opset_version=17,           # the ONNX version to export the model to
    do_constant_folding=True,   # whether to execute constant folding for optimization
    input_names=["input"],      # the model's input names
    output_names=["output"],    # the model's output names
    dynamic_axes={"input": {0: "batch_size"},      # variable length axes
                  "output": {0: "batch_size"}},
)

print("Done.")

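To double-check an export like this, the ONNX output can be compared against the original PyTorch model on the same preprocessed tensor with onnxruntime. A minimal sketch, reusing the model and tensor variables from the script above ("input"/"output" are the names passed to the exporter):

import numpy as np
import onnxruntime as ort

# Run the original PyTorch model and the freshly exported ONNX model on the same input.
with torch.no_grad():
    torch_out = model(tensor).cpu().numpy()

session = ort.InferenceSession("new_model.onnx", providers=["CPUExecutionProvider"])
onnx_out = session.run(["output"], {"input": tensor.cpu().numpy()})[0]

# The two should agree within fp32 tolerance if the export is correct.
print("max abs diff:", np.abs(torch_out - onnx_out).max())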

One thing to note: this was giving me an error at first, so I tried converting with 'Optimum' instead, gave up on that approach, and went back to torch.onnx.export(), which by then was asking me to install 'timm' (probably uninstalled by Optimum).
After re-installing timm the export worked, even though I made no changes to the script - so you might run into issues if your packages are out of date.

I'd be really grateful if you could re-convert your model, also convert it to fp16, and host both, because I'd rather have my users download them from a reputable source than host them myself.
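In case it helps, the fp16 conversion can also be done after export, on the saved fp32 .onnx file, with the onnxconverter-common package. A minimal sketch, assuming that package is installed (keep_io_types=True keeps the input/output tensors in fp32 so callers don't have to change the dtype they feed):

import onnx
from onnxconverter_common import float16

# Load the fp32 graph produced by torch.onnx.export and cast weights/ops to fp16.
model_fp32 = onnx.load("new_model.onnx")
model_fp16 = float16.convert_float_to_float16(model_fp32, keep_io_types=True)
onnx.save(model_fp16, "new_model_fp16.onnx")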
Thank you for all your models and for the time you spent on my previous 2 requests.

Owner
img = Image.open(r"image.png").convert('RGB') # can be any image I think
tensor = transform(img).unsqueeze(0).to("cpu")

For this part, I am using

input_data = torch.randn(size=(1, 3, 448, 448))

One problem is that I encountered RuntimeError: "LayerNormKernelImpl" not implemented for 'Half' when converting the fp16 model on CPU. That doesn't seem to be supported: https://github.com/pytorch/pytorch/issues/52291
Therefore, I uploaded an fp32 version that was converted on CPU and an fp16 version that was converted on GPU.
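
For reference, a GPU-side fp16 export can look roughly like this (a sketch, not necessarily the exact script used here; it assumes a CUDA device is available and that torch.load returns the full model object, as in the script above):

import torch

# Cast both the model and the dummy input to half precision on the GPU,
# since fp16 LayerNorm is not implemented on CPU.
model = torch.load("model.pth", map_location="cuda").half().eval()
dummy = torch.randn(1, 3, 448, 448, dtype=torch.float16, device="cuda")

torch.onnx.export(
    model, dummy, "model_fp16.onnx",
    export_params=True,
    opset_version=17,
    do_constant_folding=True,
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch_size"}, "output": {0: "batch_size"}},
)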

Thank you.
Tried it just now and it's giving me the exact same outputs as the .pth model and as my own conversion.
Not sure why the previous .onnx wasn't working right, but this one is fine.

jspsoli changed discussion status to closed
