metadata

title: document-ocr-demo
app_file: src/app.py
sdk: gradio
sdk_version: 4.26.0

Document OCR Demo

This repository contains a web demo for extracting key-value pairs from documents using Azure Document Intelligence.

Getting Started

Prerequisites

Git
Docker and Docker Compose

Installation

Clone the repository:

```
git clone git@github.com:vincent-lo-greenaitech/document-ocr-demo.git
```

Configure environment variables:

Locate the .env.template file in the repository, replace the placeholders with the actual Azure endpoint and key, and rename the file to .env.

For example, convert:

.env.template
```
AZURE_ENDPOINT=<YOUR_ENDPOINT>
AZURE_KEY=<YOUR_KEY>
```
To:

.env
```
AZURE_ENDPOINT="https://123.com"
AZURE_KEY="abc123kljfdkkvvs"
```
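Before starting the server, it can help to confirm that the placeholders were actually replaced. The following is a minimal sketch, not part of the repository: the function names `parse_env` and `find_problems` are hypothetical, and the parser only handles plain `KEY=VALUE` lines like those in the example above (no variable expansion or multi-line values).

```python
from pathlib import Path

# Keys taken from the .env.template shown above.
REQUIRED_KEYS = ("AZURE_ENDPOINT", "AZURE_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_problems(values: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the file looks filled in."""
    problems = []
    for key in REQUIRED_KEYS:
        value = values.get(key, "")
        if not value:
            problems.append(f"{key} is missing or empty")
        elif value.startswith("<") and value.endswith(">"):
            problems.append(f"{key} still contains the template placeholder")
    return problems


if __name__ == "__main__":
    env_path = Path(".env")
    if not env_path.exists():
        print(".env not found; copy .env.template and fill it in")
    else:
        for problem in find_problems(parse_env(env_path.read_text())):
            print(problem)
```

Run from the repository root, the script prints one line per problem and nothing when both values are set.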
Start the server with Docker Compose:

Ensure Docker Compose is installed and available, then run the following command:
```
make run
```
Or:
```
docker compose up --detach
```
Access the web demo:

The web demo should now be accessible at http://localhost:7860 (or the configured port).
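If the page does not load, a quick reachability check can tell whether anything is listening on the port at all. This is a rough sketch using only the Python standard library; the helper name `demo_is_up` is hypothetical, and the default URL assumes the port mentioned above was not changed.

```python
import urllib.error
import urllib.request


def demo_is_up(url: str = "http://localhost:7860", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at `url`, False if the connection fails."""
    # Bypass any system-wide proxy settings, since the target is a local server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or name resolution failure.
        return False


if __name__ == "__main__":
    print("demo reachable" if demo_is_up() else "demo not reachable")
```

A `False` result right after `docker compose up --detach` may simply mean the container is still starting, so retrying after a few seconds is reasonable.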

Shutting Down

To shut down the server and clean up Docker containers, use the following command:

```
make stop
```
Or:
```
docker compose down
```

Updating the Image

To update the image and run the server again, use the following command:

```
make update-image-and-run
```

Development

Managing Python Dependencies

This project uses Poetry to manage Python dependencies. Installing Poetry through pipx is recommended, since that keeps it in an isolated environment. Refer to the official Poetry documentation for detailed instructions.