---
license: apache-2.0
---

## Setup Instructions

### Clone the Surya OCR GitHub Repository

```bash
git clone https://github.com/VikParuchuri/surya.git
cd surya
```

### Switch to v0.4.14

```bash
git checkout f7c6c04
```

### Install Dependencies

The author has not provided a `requirements.txt` file, but the `environment.yml` exported from our conda environment has been uploaded. Use this file to recreate the environment for the `arabic_layout_model` model, for example with `conda env create -f environment.yml`.

### ArabicDoc Pipeline

Download `ArabicDoc.cpython-310-x86_64-linux-gnu.so`, `10x_best.pt`, and the `surya` folder from this repository. Place all three in the same directory, because they depend on each other.

```python
# This import resolves to ArabicDoc.cpython-310-x86_64-linux-gnu.so,
# which is included in this repo. It works on Linux-based systems only.
from ArabicDoc import arabic_layout_model
from surya.postprocessing.heatmap import draw_bboxes_on_image
from PIL import Image

image_path = "sample.jpg"
image = Image.open(image_path)

# Run the layout model on the image file, then draw the predicted boxes.
bboxes = arabic_layout_model(image_path)
plotted_image = draw_bboxes_on_image(bboxes, image)
```

#### Refer to `benchmark.ipynb` for a comparison between the original Surya layout model and the new layout model.

#### Refer to the `results` folder to view the output images from both models.
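
### Optional Setup Check

The compiled module, the weights file, and the `surya` folder must sit in the same directory, and the module is Linux-only, so a small pre-flight check can catch setup mistakes before the import fails. The sketch below is not part of this repository: `check_setup` and `REQUIRED` are hypothetical names, and the Python 3.10 requirement is an assumption inferred from the `cpython-310` tag in the module's file name.

```python
import sys
from pathlib import Path

# File and folder names taken from the ArabicDoc Pipeline section.
REQUIRED = [
    "ArabicDoc.cpython-310-x86_64-linux-gnu.so",
    "10x_best.pt",
    "surya",
]


def check_setup(directory=".", platform=sys.platform, version=sys.version_info[:2]):
    """Return a list of problems found; an empty list means the setup looks fine."""
    problems = []
    base = Path(directory)

    # All three artifacts must be present side by side.
    for name in REQUIRED:
        if not (base / name).exists():
            problems.append(f"missing: {name}")

    # The shared object is built for Linux only.
    if not platform.startswith("linux"):
        problems.append(f"unsupported platform: {platform}")

    # Assumption: the "cpython-310" tag means the module needs CPython 3.10.
    if tuple(version) != (3, 10):
        problems.append(f"Python {version[0]}.{version[1]} found, expected 3.10")

    return problems


if __name__ == "__main__":
    for problem in check_setup():
        print(problem)
```

Running the script from the directory that holds the downloaded files prints one line per detected problem and prints nothing when every check passes.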