metadata
title: document-ocr-demo
app_file: src/app.py
sdk: gradio
sdk_version: 4.26.0

Document OCR Demo

This repository contains a web demo for extracting key-value pairs from documents using Azure Document Intelligence.

Getting Started

Prerequisites

  • Git
  • Docker and Docker Compose

Installation

  1. Clone the repository:

    git clone git@github.com:vincent-lo-greenaitech/document-ocr-demo.git
    
  2. Configure environment variables:

    Locate the .env.template file in the repository, replace the placeholders with your Azure Document Intelligence endpoint and key, and rename the file to .env.

    For example, convert:

    .env.template

    AZURE_ENDPOINT=<YOUR_ENDPOINT>
    AZURE_KEY=<YOUR_KEY>
    

    To:

    .env

    AZURE_ENDPOINT="https://123.com"
    AZURE_KEY="abc123kljfdkkvvs"
    
  3. Start the server with Docker Compose:

    Ensure Docker Compose is installed and available on your PATH, then run the following command:

    make run
    

    Or:

    docker compose up --detach
    
  4. Access the web demo:

    The web demo should now be accessible at http://localhost:7860 (or the configured port).
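
Before starting the server in step 3, it can help to confirm that the .env file from step 2 no longer contains template placeholders. The following is a minimal sketch, not part of the repository: the has_placeholders helper and the ENV_FILE variable are hypothetical names, and the check assumes placeholders keep the <YOUR_...> form shown in .env.template.

```shell
# Hypothetical helper (not part of this repository): exits 0 if the given
# file still contains an unreplaced <YOUR_...> placeholder.
has_placeholders() {
    grep -q '<YOUR_[A-Z_]*>' "$1"
}

# ENV_FILE is an assumed override; by default the check looks at ./.env.
env_file="${ENV_FILE:-.env}"

if [ ! -f "$env_file" ]; then
    echo "missing $env_file: fill in .env.template and rename it to .env"
elif has_placeholders "$env_file"; then
    echo "$env_file still contains template placeholders"
else
    echo "$env_file looks configured"
fi
```

The check only reports what it finds, so it is safe to run from the repository root at any point before make run or docker compose up --detach.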

Shutting Down

To shut down the server and clean up Docker containers, use the following command:

make stop

Or:

docker compose down

Updating the Image

To update the image and run the server again, use the following command:

make update-image-and-run

Development

Managing Python Dependencies

This project uses Poetry to manage Python dependencies. Installing Poetry through pipx is recommended, because pipx keeps it isolated from the project's own environment. Refer to the official Poetry documentation for more detailed instructions.
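
As a rough sketch of that setup, the commands below install Poetry with pipx and then install the project's dependencies. It assumes pipx is already on your PATH and that the repository contains a pyproject.toml in the directory where the commands are run. The step wrapper and the RUN variable are hypothetical conveniences, not part of this repository: by default each command is only printed, and it is executed when RUN=1 is set.

```shell
# Hypothetical wrapper: print the command, and run it only when RUN=1.
step() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

step pipx install poetry   # install Poetry in its own isolated environment
step poetry --version      # confirm the poetry command is available
step poetry install        # install dependencies declared in pyproject.toml
```

Saved to a file, the sketch can be run once as a dry run to review the commands, then again with RUN=1 in the environment to apply them.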