OCR-CRNN (Printed Word Recognition)

A small CRNN + CTC model trained from scratch to read a cropped image of a printed word and output the text. ~5.3M parameters.

Results

Evaluated on held-out synthetic printed words:

Metric                        Value
Character error rate (CER)    1.3%
Exact-word accuracy           93.4%
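
The card does not say how these two metrics were computed. As a rough sketch, assuming the common definitions (CER as total Levenshtein edits divided by total reference characters, and exact-word accuracy as the fraction of predictions that match their reference exactly), they could be calculated as follows. The function names `edit_distance` and `evaluate` are illustrative and are not taken from the repository.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions each cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def evaluate(refs, hyps):
    """Return (CER, exact-word accuracy) for paired reference/predicted strings."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_chars = sum(len(r) for r in refs)
    exact = sum(r == h for r, h in zip(refs, hyps))
    return total_edits / total_chars, exact / len(refs)


if __name__ == "__main__":
    cer, acc = evaluate(["Hello", "World"], ["Hello", "Wor1d"])
    print(f"CER={cer:.3f}  exact-word accuracy={acc:.3f}")
```

Under this definition CER is micro-averaged over characters, so longer words weigh more than shorter ones. Averaging a per-word CER instead would give a slightly different number.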

Architecture

  • CNN downsamples a 32x160 grayscale image to a sequence of 40 feature columns.
  • BiLSTM (2 layers) models left/right context across the sequence.
  • Linear + CTC scores 63 classes for every column: the 62 characters (0-9A-Za-z) plus the CTC blank.
  • Greedy CTC decoding at inference.
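
To illustrate the last step, here is a minimal pure-Python sketch of greedy CTC decoding: take the best class for each column, collapse consecutive repeats, then drop blanks. It assumes that index 0 is the blank and that classes 1 to 62 follow the order 0-9, A-Z, a-z. The real mapping lives in alphabet.py and may differ, and `greedy_ctc_decode` is a hypothetical name.

```python
import string

# Assumption: blank at index 0, then digits, uppercase, lowercase (62 characters).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
BLANK = 0


def greedy_ctc_decode(logits):
    """Decode per-column scores (one list of 63 scores per column) into text."""
    # Best class index for each feature column.
    best = [max(range(len(col)), key=col.__getitem__) for col in logits]
    chars = []
    prev = BLANK
    for idx in best:
        # Emit only when the class changes and is not the blank.
        if idx != BLANK and idx != prev:
            chars.append(ALPHABET[idx - 1])
        prev = idx
    return "".join(chars)
```

A blank between two identical characters is what lets CTC output a doubled letter: the column path `c c _ a _ a t t` decodes to `caat`, whereas `c c a a t t` decodes to `cat`.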

Usage

import sys
from huggingface_hub import snapshot_download

repo = snapshot_download("Abulqosim0227/ocr-crnn-printed")
sys.path.insert(0, repo)
from infer import load_model, read

model, device = load_model(f"{repo}/model.safetensors")
print(read(model, device, "word.png"))

Or from a clone: python infer.py word.png

Local demo UI

A small Gradio web interface (app.py) for testing in your browser:

pip install -r requirements.txt
python app.py

Open http://127.0.0.1:7860, then upload a word image or click an example from samples/.


Limitations

  • Reads single words / short strings up to 10 characters.
  • Character set is 0-9A-Za-z only — no spaces or punctuation.
  • Trained on synthetic printed text (DejaVu fonts). It is not trained for scene text (photos) or handwriting.
  • No language model, so visually ambiguous glyphs can be confused (e.g. capital I vs lowercase l). Add a language prior for production use.
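
One very simple form of language prior, when the expected vocabulary is known in advance, is to snap each prediction to the closest entry in a word list. The sketch below uses the standard-library `difflib` for this. It is only one possible approach and is not part of this repository. The helper name `snap_to_lexicon`, the `cutoff` value and the example lexicon are all assumptions.

```python
import difflib


def snap_to_lexicon(prediction, lexicon, cutoff=0.8):
    """Return the closest lexicon word, or the raw prediction if none is similar enough."""
    matches = difflib.get_close_matches(prediction, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else prediction


if __name__ == "__main__":
    words = ["london", "paris"]  # hypothetical vocabulary
    # A capital I misread in place of a lowercase l is pulled back to the lexicon entry.
    print(snap_to_lexicon("Iondon", words))
    # Strings with no close match are returned unchanged.
    print(snap_to_lexicon("xyz", words))
```

This only helps for closed vocabularies. For random codes or serial numbers, where any string is valid, a lexicon would introduce errors rather than remove them.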

Training

  • Data generated on the fly: random strings rendered with random fonts + light rotation/blur/noise. No external dataset.
  • Trained for 8 epochs with Adam (lr 1e-3) and CTC loss; this takes a few minutes on a single GPU.
  • Reproduce with python train.py (requires the DejaVu fonts used in dataset.py).
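
As a rough illustration of the label side of on-the-fly generation, the sketch below samples random strings over the 62-character set and checks that each one fits into the 40 feature columns under CTC. Rendering with fonts and augmentation is left out. The actual logic is in dataset.py and may differ; `random_label`, `ctc_min_columns` and the minimum length of 1 are assumptions.

```python
import random
import string

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
MAX_LEN = 10      # longest string the model is stated to read
NUM_COLUMNS = 40  # feature columns produced by the CNN


def random_label(rng, min_len=1, max_len=MAX_LEN):
    """Sample a random string of min_len..max_len characters (min_len is an assumption)."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def ctc_min_columns(label):
    """Fewest columns CTC needs: one per character plus a blank between adjacent repeats."""
    repeats = sum(a == b for a, b in zip(label, label[1:]))
    return len(label) + repeats


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        label = random_label(rng)
        print(label, ctc_min_columns(label), "<=", NUM_COLUMNS)
```

In the worst case, ten identical characters need 10 + 9 = 19 columns, so every label up to 10 characters fits comfortably within the 40 columns.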

Files

model.safetensors (weights), config.json (architecture), and the full training and inference code (alphabet.py, preprocess.py, model.py, dataset.py, train.py, infer.py).
