TrOCR deployment in production

#6
by CristianJD - opened

Hi, does anyone know how to make inference with TrOCR as fast as possible? I deployed it with Docker on OpenShift, but it's too slow. I'm already using the ONNX format, but I can't quantize the model because quantization isn't implemented yet for Vision-Encoder-Decoder.
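Even when `transformers` doesn't support quantization for a Vision-Encoder-Decoder model directly, dynamic INT8 quantization can still be applied to the exported ONNX graphs (e.g. via `onnxruntime.quantization.quantize_dynamic`). As a rough sketch of what that technique does, here is the core idea in NumPy: weights are stored as int8 with a per-tensor scale and dequantized for the matmul. This is a conceptual illustration, not TrOCR-specific code:

```python
import numpy as np

# Conceptual sketch of dynamic INT8 weight quantization, the technique
# onnxruntime.quantization.quantize_dynamic applies to ONNX graph weights:
# weights are stored as int8 plus a per-tensor scale, cutting weight
# memory 4x, at the cost of a small numerical error.

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor quantization: float32 -> (int8, scale)."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)  # stand-in weight matrix
x = rng.standard_normal((1, 256)).astype(np.float32)    # stand-in activation

q, scale = quantize_int8(w)
y_fp32 = x @ w
y_int8 = x @ dequantize(q, scale)

# Output stays close to the fp32 result despite 8-bit weights.
rel_err = np.linalg.norm(y_fp32 - y_int8) / np.linalg.norm(y_fp32)
print(f"relative error: {rel_err:.4f}")
```

In practice you would point `quantize_dynamic` at the exported encoder and decoder `.onnx` files separately; whether the resulting graphs stay numerically accurate enough for your OCR quality target is something to validate on your own data.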

Hi, I also tried the ONNX format but didn't have much luck making inference faster. On an NVIDIA A10 I can get inference down to ~120 ms, but that's still too slow for my use case. Did you happen to find anything since then? Quantization doesn't seem to work for me either.
