GPU Inference Time

#3
by louispk - opened

Hello,

Thanks for the extensive documentation and experiences you have provided surrounding your usage and training of Donut :) Inspired by your work, I got a finetuned version of Donut running locally, but I am having trouble getting inference speeds on the GPU down.

At the moment, I reach average speeds of about 4-5 seconds per invoice on an Nvidia K80, and it takes about the same time on a CPU-only machine.

Do you have an idea what might be the reason for my comparatively slow inference on the GPU, and did you have to make any significant changes to your code to achieve the ~2 seconds you mention?

Thanks a lot again in advance!

Best,
Louis

Hi Louis,

My guesstimate of 1-2 seconds is based on inference over the validation set. I ran it in Colab; I don't know the exact GPU, but it was the lower tier for sure, which could also have been a K80 or better...
Did you say that it also takes 4-5 seconds on CPU for you?
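If the GPU and CPU timings really are the same, I would first double-check that the model and the input tensors actually end up on the GPU. A minimal sketch of such a check (assuming a plain PyTorch + transformers setup, not your actual code):

import torch
from transformers import VisionEncoderDecoderModel

device = "cuda" if torch.cuda.is_available() else "cpu"
print("CUDA available:", torch.cuda.is_available())

model = VisionEncoderDecoderModel.from_pretrained("naver-clova-ix/donut-base")
model.to(device)
print("Model device:", next(model.parameters()).device)  # should print cuda:0, not cpu

# inputs have to be moved to the same device before generate(), e.g.:
# pixel_values = pixel_values.to(device)

If either of those still says cpu, the GPU is simply not being used.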
The input and trained resolution also have a big influence on speed, for example:

from transformers import VisionEncoderDecoderConfig

image_size = [1920, 1280]
config = VisionEncoderDecoderConfig.from_pretrained("naver-clova-ix/donut-base")
config.encoder.image_size = image_size  # (height, width)
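For reference, this is roughly how that config (and the image_size from the snippet above) can be used to load the model and run inference on the GPU. It is a minimal sketch along the lines of the standard transformers Donut example, not my exact code; the base checkpoint name, task prompt and image path are placeholders you would swap for your fine-tuned versions:

import torch
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

processor = DonutProcessor.from_pretrained("naver-clova-ix/donut-base")
model = VisionEncoderDecoderModel.from_pretrained("naver-clova-ix/donut-base", config=config)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()

# make the processor resize to the resolution the encoder was configured for
# (depending on the transformers version, size may be a dict or a (width, height) list)
processor.image_processor.size = {"height": image_size[0], "width": image_size[1]}

# placeholder input image
image = Image.open("invoice.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values.to(device)

# placeholder task prompt; use the start token your fine-tune was trained with
task_prompt = "<s_cord-v2>"
decoder_input_ids = processor.tokenizer(task_prompt, add_special_tokens=False, return_tensors="pt").input_ids.to(device)

with torch.no_grad():
    outputs = model.generate(pixel_values, decoder_input_ids=decoder_input_ids, max_length=512)
print(processor.batch_decode(outputs)[0])

Generation settings such as max_length and num_beams also affect how long each invoice takes.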

Other than that, I did not have any specific optimizations for speed, no.

Regards

Toon

to-be changed discussion status to closed
