Pytesseract related error

#3
by yagcaglar - opened

Hi, while trying the code given in the model card, I ran into an error:
CalledProcessError: Command '['/home/ubuntu/docquery/env/bin/pytesseract', '--version']' returned non-zero exit status 1.
where the specified path is the pytesseract path that I set with:
pytesseract.pytesseract.tesseract_cmd = '/home/ubuntu/docquery/env/lib/python3.8/site-packages/pytesseract'

My pytesseract version is '0.3.10' and my Python version is '3.8.10'. What should I do to solve the problem? Thanks in advance.

Impira org

Hi @yagcaglar , this looks like an issue with how you've installed tesseract on your machine. I assume if you run /home/ubuntu/docquery/env/bin/pytesseract --version the command fails?
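For anyone hitting the same error, here is a minimal sketch of how to check the install from Python. It assumes the cause is that `tesseract_cmd` points at the pytesseract Python package (or its pip-generated launcher script) rather than at the Tesseract OCR executable, which is a separate program that pytesseract only wraps. The helper name `find_ocr_binary` is hypothetical and not part of any library; it uses only the standard library.

```python
import shutil
import subprocess
from typing import Optional


def find_ocr_binary(name: str = "tesseract") -> Optional[str]:
    """Return the full path of an executable on PATH that answers --version, else None.

    Hypothetical helper for illustration; not part of pytesseract.
    """
    path = shutil.which(name)
    if path is None:
        # Nothing with that name is on PATH.
        return None
    try:
        # Same kind of check that failed in the traceback: run `<cmd> --version`.
        subprocess.run([path, "--version"], check=True, capture_output=True, timeout=10)
    except (subprocess.SubprocessError, OSError):
        # The file exists but does not run cleanly.
        return None
    return path


if __name__ == "__main__":
    binary = find_ocr_binary()
    if binary is None:
        print("No working tesseract executable found on PATH")
    else:
        print(f"Found tesseract at {binary}")
        # Assuming pytesseract is installed, point it at the OCR engine binary:
        # import pytesseract
        # pytesseract.pytesseract.tesseract_cmd = binary
```

If this reports that nothing was found, the Tesseract engine itself probably still needs to be installed through the system package manager; `pip install pytesseract` only installs the Python wrapper.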

Yes, it fails. But I am able to get the pytesseract version from Python. Also, when I run "pip install pytesseract", it says "Requirement already satisfied" with the path that I specified in tesseract_cmd. However, the code looks for the package in a different path. That path contains a file with these lines of code:

#!/home/ubuntu/docquery/env/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from pytesseract.pytesseract import main
if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(main())

If I installed it wrong, how should I install it correctly? :)

Impira org

Hi @yagcaglar , I'd recommend you file an issue with the pytesseract library so that their community can help you resolve the installation issue.

Alternatively, you can use our DocQuery library, which supports other OCR options like EasyOCR.

Thanks for the help :)

yagcaglar changed discussion status to closed
