---
title: Multiocr
emoji: ⚡
colorFrom: blue
colorTo: gray
sdk: docker
pinned: false
license: mit
---
# Multiocr
This package provides a common interface to multiple OCR backends.
# Installation
```
pip install multiocr
```
# Supported OCR Backends
- Tesseract
- PaddleOCR
- AWS Textract
- EasyOCR
- DocTR OCR
The output format is similar across all OCR backends.
# Code Example
**Tesseract**
```python
from multiocr import OcrEngine

# Parameters passed to the Tesseract backend
config = {
    "lang": "eng",
    "config": "--psm 6",
}
image_file = "path/to/image.jpg"

engine = OcrEngine("tesseract", config)
text_dict = engine.text_extraction(image_file)
# Named json_output so the standard-library json module is not shadowed
json_output = engine.text_extraction_to_json(text_dict)
df = engine.text_extraction_to_df(text_dict)
plain_text = engine.extract_plain_text(text_dict)
```
**PaddleOCR**
```python
from multiocr import OcrEngine

# Parameters passed to the PaddleOCR backend
config = {
    "lang": "en",
}
image_file = "path/to/image.jpg"

engine = OcrEngine("paddle_ocr", config)
text_dict = engine.text_extraction(image_file)
# Named json_output so the standard-library json module is not shadowed
json_output = engine.text_extraction_to_json(text_dict)
df = engine.text_extraction_to_df(text_dict)
plain_text = engine.extract_plain_text(text_dict)
```
**AWS Textract**
```python
import os

from multiocr import OcrEngine

# Region and credentials are read from environment variables
config = {
    "region_name": os.getenv("region_name"),
    "aws_access_key_id": os.getenv("aws_access_key_id"),
    "aws_secret_access_key": os.getenv("aws_secret_access_key"),
}
image_file = "path/to/image.jpg"

engine = OcrEngine("aws_textract", config)
text_dict = engine.text_extraction(image_file)
# Named json_output so the standard-library json module is not shadowed
json_output = engine.text_extraction_to_json(text_dict)
df = engine.text_extraction_to_df(text_dict)
plain_text = engine.extract_plain_text(text_dict)
```
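`os.getenv` returns `None` for a variable that is not set, so a missing credential would only show up later, when the engine is created or called. The sketch below is one way to fail early instead. `textract_config_from_env` and `REQUIRED_VARS` are hypothetical names used for illustration and are not part of multiocr; the variable names are the ones used in the example above.

```python
import os

# Environment variable names taken from the AWS Textract example above.
REQUIRED_VARS = ("region_name", "aws_access_key_id", "aws_secret_access_key")


def textract_config_from_env(environ=None):
    """Build the Textract config dict, raising if any variable is unset or empty.

    Hypothetical helper: not part of multiocr.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

The returned dictionary can then be passed as `config` to `OcrEngine("aws_textract", config)`.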
**EasyOCR**
```python
from multiocr import OcrEngine

# Parameters passed to the EasyOCR backend
config = {
    "lang_list": ["en"],
}
image_file = "path/to/image.jpg"

engine = OcrEngine("easy_ocr", config)
text_dict = engine.text_extraction(image_file)
# Named json_output so the standard-library json module is not shadowed
json_output = engine.text_extraction_to_json(text_dict)
df = engine.text_extraction_to_df(text_dict)
plain_text = engine.extract_plain_text(text_dict)
print(plain_text)
```
To access the output of an individual OCR engine in its own raw format, read the `raw_ocr` attribute:
```python
raw_ocr_output = engine.engine.raw_ocr
```
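Since every backend in the examples above is driven through the same methods, comparing engines on one image comes down to looping over backend names and configs. The following is a minimal sketch under that assumption. `run_backends` is a hypothetical helper, not part of multiocr, and it takes the engine class as an argument (`engine_factory`) so it can be exercised without any OCR backend installed.

```python
def run_backends(image_file, backends, engine_factory):
    """Run each backend on one image and collect its plain text.

    Hypothetical helper, not part of multiocr.
    backends maps a backend name to its config dict; engine_factory is
    called as engine_factory(name, config), the same way OcrEngine is
    constructed in the examples above.
    """
    results = {}
    for name, config in backends.items():
        engine = engine_factory(name, config)
        text_dict = engine.text_extraction(image_file)
        results[name] = engine.extract_plain_text(text_dict)
    return results
```

With the package installed, it could be called as `run_backends("path/to/image.jpg", {"tesseract": {"lang": "eng"}, "easy_ocr": {"lang_list": ["en"]}}, OcrEngine)`.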
**config** holds the input parameters for the selected OCR backend and must be a Python dictionary. If it is not given, the backend library's own default parameters are used.
The accepted parameters differ between backends; see each backend's repository for the full list.
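One common way for a wrapper to get this behaviour is to unpack the dictionary into keyword arguments, so that any key left out falls back to the backend's default. Whether multiocr does exactly this internally is an assumption; `call_backend` and `fake_ocr` below are hypothetical stand-ins, not real multiocr or OCR library functions.

```python
def fake_ocr(image, lang="eng", psm=3):
    """Stand-in for a backend function that has its own default parameters."""
    return {"image": image, "lang": lang, "psm": psm}


def call_backend(backend_fn, image, config=None):
    """Forward config as keyword arguments to the backend function.

    An omitted or empty config passes nothing extra, so the backend's
    own defaults apply. Hypothetical helper, not part of multiocr.
    """
    return backend_fn(image, **(config or {}))


# No config: the backend defaults are used.
default_result = call_backend(fake_ocr, "page.jpg")
# Partial config: only the given key overrides its default.
custom_result = call_backend(fake_ocr, "page.jpg", {"lang": "deu"})
```

A key the backend does not accept raises a `TypeError` in this sketch, which is one reason to check the backend's documentation for its allowed parameters.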
# References & Acknowledgements
- [Pytesseract](https://github.com/madmaze/pytesseract)
- [Tesseract](https://github.com/tesseract-ocr/tesseract)
- [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
- [AWS Textract](https://docs.aws.amazon.com/textract/latest/dg/what-is.html)
- [EasyOCR](https://www.jaided.ai/easyocr/)
- [Doctr-Ocr](https://github.com/mindee/doctr)
**AWS Textract** will not work because it requires an access key and a secret access key.