stepfun-ai/GOT-OCR2_0 · Tesseract vs. GOT-OCR2_0: Which Performs Better for Text Extraction from Images?

Sep 25, 2024

edited Sep 25, 2024

I'm curious about the differences between Tesseract and GOT-OCR2_0. Which one performs better?
My main goal is to convert an image file into general Markdown format. Do you recommend using GOT-OCR2_0's plain text OCR to extract text and then applying markdownify, or using GOT-OCR2_0's formatted text OCR to extract Mathpix Markdown and convert it to general Markdown? Which approach would be more efficient, and which would you recommend?
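One note on the two routes: markdownify converts HTML to Markdown, so it has little to work with on plain OCR text that carries no markup. If the formatted output is used instead, much of the conversion to general Markdown may come down to rewriting math delimiters. Below is a minimal sketch, assuming the formatted text wraps inline math in `\(...\)` and display math in `\[...\]`, and that the target renderer expects `$...$` and `$$...$$`. The helper name `normalize_math_delimiters` is hypothetical and is not part of GOT-OCR2_0 or Mathpix.

```python
import re

# Assumption: inline math is wrapped in \( ... \) and display math in \[ ... \].
_INLINE = re.compile(r"\\\((.+?)\\\)", re.DOTALL)
_DISPLAY = re.compile(r"\\\[(.+?)\\\]", re.DOTALL)


def normalize_math_delimiters(text: str) -> str:
    """Rewrite LaTeX-style math delimiters as dollar-sign delimiters."""
    # Function replacements copy the LaTeX body verbatim, so backslashes
    # such as the one in "\pi" are not treated as regex escape sequences.
    text = _DISPLAY.sub(lambda m: "$$" + m.group(1).strip() + "$$", text)
    text = _INLINE.sub(lambda m: "$" + m.group(1).strip() + "$", text)
    return text


if __name__ == "__main__":
    sample = r"The area is \( \pi r^2 \)."
    print(normalize_math_delimiters(sample))
```

Real output may also contain tables or other constructs that need their own handling, so this covers only the delimiter step.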

Bigrsxx

Nov 29, 2024

I had the same question. My goal was to extract mathematical expressions from a photo, and in my experience GOT-OCR2_0 handles this far better than Tesseract.
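For anyone who wants to try this on their own photos, here is a rough sketch of running both the plain and the formatted mode on one image. It assumes the `chat` method and the `ocr_type` values `"ocr"` and `"format"` as shown on the model card, plus a CUDA device; the function names `load_got_ocr` and `ocr_both_modes` are made up for this example.

```python
def load_got_ocr(repo_id: str = "stepfun-ai/GOT-OCR2_0"):
    """Load model and tokenizer (assumes transformers, torch, and a CUDA GPU)."""
    # Imported lazily so the rest of this file works without transformers.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
    model = AutoModel.from_pretrained(
        repo_id,
        trust_remote_code=True,  # the chat() method lives in the repo's own code
        low_cpu_mem_usage=True,
        device_map="cuda",
        use_safetensors=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    return model.eval(), tokenizer


def ocr_both_modes(model, tokenizer, image_file: str) -> dict:
    """Run plain and formatted OCR on one image and return both strings."""
    return {
        # Assumed mode names, taken from the model card's usage section.
        "plain": model.chat(tokenizer, image_file, ocr_type="ocr"),
        "formatted": model.chat(tokenizer, image_file, ocr_type="format"),
    }
```

Typical use would be `model, tokenizer = load_got_ocr()` followed by `ocr_both_modes(model, tokenizer, "photo.png")`, where `photo.png` is a placeholder path.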

Pavankumar03

Dec 24, 2024

GOT-OCR2_0 is the better option because it is a model that can be fine-tuned on custom data. It extracts text well even when the image is noisy or unclear, but it struggles to extract text accurately from handwritten images. If needed, we can fine-tune the model to improve performance.

On the other hand, Tesseract is a tool specifically designed for OCR tasks. It produces accurate output when the image is clear and free of noise. However, like GOT-OCR2_0, it also struggles to extract text accurately from handwritten images.
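Since the answer depends on the images involved, one way to settle it for a given use case is to transcribe a few sample images by hand and score each engine's output against that reference. The sketch below computes a character error rate (edit distance divided by reference length); it is a generic metric, not something either tool provides, and the function names are illustrative.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from a
                curr[j - 1] + 1,     # insert a character into a
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edits needed to turn the OCR output into the reference, per reference character."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Calling `character_error_rate(ground_truth, ocr_text)` once per engine on the same image gives two numbers to compare; lower is better.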