GPU Inference Time

#3
by louispk - opened

Hello,

Thanks for the extensive documentation and experiences you have provided surrounding your usage and training of Donut :) Inspired by your work, I got a finetuned version of Donut running locally, but I am having trouble getting inference speeds on the GPU down.

At the moment, I reach average speeds of about 4-5 seconds per invoice on an Nvidia K80, and it takes about the same time on a CPU-only machine.

Do you have an idea what might be the reason for my comparatively slow inference on the GPU, and did you have to make any significant changes to your code to achieve the ~2 seconds you mention?

Thanks a lot again in advance!

Best,
Louis

Hi Louis,

My guesstimate of 1-2 seconds is based on inference over the validation set. I ran it in Colab; I don't know the exact GPU, but it was the lower tier for sure, which could also have been a K80 or better...
Did you say that it also takes 4-5 seconds on CPU for you?
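If the GPU and CPU timings really are the same, I would first double-check that the model and the input tensors actually end up on the GPU. A minimal sketch of such a check (assuming a plain PyTorch + transformers setup, not your actual code):

import torch
from transformers import VisionEncoderDecoderModel

device = "cuda" if torch.cuda.is_available() else "cpu"
print("CUDA available:", torch.cuda.is_available())

model = VisionEncoderDecoderModel.from_pretrained("naver-clova-ix/donut-base")
model.to(device)
print("Model device:", next(model.parameters()).device)  # should print cuda:0, not cpu

# inputs have to be moved to the same device before generate(), e.g.:
# pixel_values = pixel_values.to(device)

If either of those still says cpu, the GPU is simply not being used.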
The input and trained resolution also have a big influence on speed, for example:

from transformers import VisionEncoderDecoderConfig

image_size = [1920, 1280]
config = VisionEncoderDecoderConfig.from_pretrained("naver-clova-ix/donut-base")
config.encoder.image_size = image_size  # (height, width)
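For reference, this is roughly how that config (and the image_size from the snippet above) can be used to load the model and run inference on the GPU. It is a minimal sketch along the lines of the standard transformers Donut example, not my exact code; the base checkpoint name, task prompt and image path are placeholders you would swap for your fine-tuned versions:

import torch
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

processor = DonutProcessor.from_pretrained("naver-clova-ix/donut-base")
model = VisionEncoderDecoderModel.from_pretrained("naver-clova-ix/donut-base", config=config)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()

# make the processor resize to the resolution the encoder was configured for
# (depending on the transformers version, size may be a dict or a (width, height) list)
processor.image_processor.size = {"height": image_size[0], "width": image_size[1]}

# placeholder input image
image = Image.open("invoice.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values.to(device)

# placeholder task prompt; use the start token your fine-tune was trained with
task_prompt = "<s_cord-v2>"
decoder_input_ids = processor.tokenizer(task_prompt, add_special_tokens=False, return_tensors="pt").input_ids.to(device)

with torch.no_grad():
    outputs = model.generate(pixel_values, decoder_input_ids=decoder_input_ids, max_length=512)
print(processor.batch_decode(outputs)[0])

Generation settings such as max_length and num_beams also affect how long each invoice takes.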

Other than that, I did not have any specific optimizations for speed, no.

Regards

Toon

to-be changed discussion status to closed
