OCR seems to be off

#4
by eelang - opened

The table and structure detection works great (especially for borderless tables), but the OCR seems to be making simple mistakes. Which OCR engine is used?

I'm using EasyOCR, which is an open-source package. There are definitely better OCR engines out there, like Microsoft's Read API.

Thanks for the details. You're probably aware of this, but microsoft/trocr-large-printed is hosted here on HF and seems to be doing a better job.
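
In case it helps, here is a rough sketch of how the recognition step could be swapped out. It assumes the table-structure step already yields one bounding box per cell (not shown in this thread), and it crops each cell first because TrOCR reads one text-line image at a time. The names `load_trocr_recognizer`, `ocr_cells`, and `boxes` are made up for illustration and are not from the demo's code.

```python
from typing import Callable, List, Sequence, Tuple

# (left, upper, right, lower), the box format PIL's Image.crop expects
Box = Tuple[int, int, int, int]


def load_trocr_recognizer(model_name: str = "microsoft/trocr-large-printed") -> Callable:
    """Return a function that maps one cropped PIL image to its recognized text."""
    # Imported lazily so ocr_cells below can be used without transformers installed.
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(model_name)
    model = VisionEncoderDecoderModel.from_pretrained(model_name)

    def recognize(crop) -> str:
        # TrOCR expects an RGB image containing a single line of text.
        pixel_values = processor(images=crop.convert("RGB"), return_tensors="pt").pixel_values
        generated_ids = model.generate(pixel_values)
        return processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()

    return recognize


def ocr_cells(image, boxes: Sequence[Box], recognize: Callable) -> List[str]:
    """Crop each (hypothetical) cell box from the page image and recognize it."""
    return [recognize(image.crop(box)) for box in boxes]
```

Usage would be something like `ocr_cells(page_image, boxes, load_trocr_recognizer())`, where `page_image` is a PIL image of the page. Cells that contain several lines of text would probably need to be split into lines before being passed in.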
