OCR-CRNN (Printed Word Recognition)
A small CRNN + CTC model trained from scratch to read a cropped image of a printed word and output the text. ~5.3M parameters.
Results
Evaluated on held-out synthetic printed words:
| Metric | Value |
|---|---|
| Character error rate (CER) | 1.3% |
| Exact-word accuracy | 93.4% |
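For reference, here is how the two metrics above are commonly computed. This is a minimal sketch, assuming CER is the total Levenshtein edit distance divided by the total number of reference characters. The function names and sample strings are illustrative and are not taken from this repo's evaluation code.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(refs, hyps) -> float:
    """Character error rate: total edits / total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_chars = sum(len(r) for r in refs)
    return total_edits / total_chars


def word_accuracy(refs, hyps) -> float:
    """Fraction of predictions that match the reference exactly."""
    return sum(r == h for r, h in zip(refs, hyps)) / len(refs)


# Made-up example strings, not model outputs.
refs = ["Hello", "World42", "abc"]
hyps = ["Hello", "Worid42", "abc"]
print(f"CER: {cer(refs, hyps):.3f}")
print(f"Word accuracy: {word_accuracy(refs, hyps):.3f}")
```

One wrong character in a word costs that word its exact match but adds only one edit to the CER. This is why the word-level error is typically much larger than the character-level error.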
Architecture
- CNN downsamples a 32x160 grayscale image to a sequence of 40 feature columns.
- BiLSTM (2 layers) models left/right context across the sequence.
- Linear + CTC predicts a character per column over 62 classes (0-9A-Za-z) plus a blank.
- Greedy CTC decoding at inference.
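The greedy decoding step can be sketched in plain Python: take the best class per column, merge consecutive repeats, then drop blanks. The class layout here is an assumption (blank at index 0, then 0-9, A-Z, a-z); the ordering in alphabet.py may differ, and `one_hot` is a hypothetical helper used only for the demo.

```python
import string

# Assumed class layout: index 0 is the CTC blank, indices 1..62 map to
# 0-9, A-Z, a-z in that order. The repo's alphabet.py may differ.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
BLANK = 0


def greedy_ctc_decode(scores):
    """Decode per-column class scores: argmax, merge repeats, drop blanks."""
    best_path = [max(range(len(col)), key=col.__getitem__) for col in scores]
    chars = []
    prev = BLANK
    for idx in best_path:
        if idx != BLANK and idx != prev:
            chars.append(ALPHABET[idx - 1])
        prev = idx
    return "".join(chars)


def one_hot(idx, n=len(ALPHABET) + 1):
    """Demo helper (hypothetical): a score column that peaks at class idx."""
    return [1.0 if k == idx else 0.0 for k in range(n)]


# Best path "c c - a a - - b b" (where - is the blank) decodes to "cab".
a, b, c = (ALPHABET.index(ch) + 1 for ch in "abc")
demo = [one_hot(i) for i in [c, c, BLANK, a, a, BLANK, BLANK, b, b]]
print(greedy_ctc_decode(demo))
```

A doubled letter survives only when a blank separates its two occurrences. Even a 10-character word made of one repeated letter therefore needs just 19 columns (10 characters plus 9 blanks), well within the 40 available.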
Usage
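import sys
from huggingface_hub import snapshot_download
# Download the repo (weights plus inference code) into the local cache.
repo = snapshot_download("Abulqosim0227/ocr-crnn-printed")
# Make the repo's infer.py importable.
sys.path.insert(0, repo)
from infer import load_model, read
model, device = load_model(f"{repo}/model.safetensors")
# "word.png" is the path to a cropped image of a single printed word.
print(read(model, device, "word.png"))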
Or from a clone: python infer.py word.png
Local demo UI
A small Gradio web interface (app.py) for testing in your browser:
pip install -r requirements.txt
python app.py
Open http://127.0.0.1:7860, then upload a word image or click an example from samples/.
Limitations
- Reads single words / short strings up to 10 characters.
- Character set is 0-9A-Za-z only: no spaces or punctuation.
- Trained on synthetic printed text (DejaVu fonts). It is not trained for scene text (photos) or handwriting.
- No language model, so visually ambiguous glyphs can be confused (e.g. capital I vs lowercase l). Add a language prior for production use.
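One simple form of language prior is a lexicon check: if the raw prediction is not a known word, try swapping look-alike glyphs until one is. This is a rough sketch, not part of this repo. The confusable groups, the `rescore_with_lexicon` name, and the lexicon are all illustrative assumptions.

```python
from itertools import product

# Illustrative groups of look-alike glyphs; these are assumptions, not a
# list measured from this model's errors.
CONFUSABLE_GROUPS = ["Il1", "O0"]
ALTERNATIVES = {ch: group for group in CONFUSABLE_GROUPS for ch in group}


def rescore_with_lexicon(prediction, lexicon):
    """Return a lexicon word reachable by swapping look-alike glyphs, if any."""
    if prediction in lexicon:
        return prediction
    # For each position, the glyphs the model could plausibly have meant.
    options = [ALTERNATIVES.get(ch, ch) for ch in prediction]
    for candidate in product(*options):
        word = "".join(candidate)
        if word in lexicon:
            return word
    return prediction  # no lexicon match: keep the raw model output


lexicon = {"Illinois", "level", "Oslo", "2024"}  # hypothetical vocabulary
print(rescore_with_lexicon("lllinois", lexicon))
print(rescore_with_lexicon("2O24", lexicon))
print(rescore_with_lexicon("xyz", lexicon))
```

The search is exhaustive over confusable positions, which stays cheap for strings of up to 10 characters. If several lexicon words are reachable it returns the first one found, so a production system would more likely weight candidates by the model's per-character scores.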
Training
- Data generated on the fly: random strings rendered with random fonts + light rotation/blur/noise. No external dataset.
- 8 epochs, Adam (lr 1e-3), CTC loss, on a single GPU in a few minutes.
- Reproduce with python train.py (requires the DejaVu fonts used in dataset.py).
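The label side of on-the-fly generation can be sketched as follows. This is only an assumption about how dataset.py might sample strings: the uniform length distribution and the index layout (blank at 0, characters at 1..62) are guesses, and the rendering step (fonts, rotation, blur, noise) is omitted.

```python
import random
import string

# Assumed layout: blank at index 0, then 0-9, A-Z, a-z at indices 1..62.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}
MAX_LEN = 10  # the model reads strings of up to 10 characters


def random_label(rng):
    """Sample a random string of 1..MAX_LEN characters from the alphabet."""
    length = rng.randint(1, MAX_LEN)
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def encode(text):
    """Map characters to CTC target indices (never emits the blank, 0)."""
    return [CHAR_TO_INDEX[ch] for ch in text]


rng = random.Random(0)  # seeded so the samples are repeatable
for _ in range(3):
    label = random_label(rng)
    print(label, encode(label))
```

Because every sample is drawn fresh, the model never sees a fixed training set. The trade-off is that the strings are not real words, so the network learns glyph shapes but no vocabulary.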
Files
model.safetensors (weights), config.json (architecture), and the full training and inference code (alphabet.py, preprocess.py, model.py, dataset.py, train.py, infer.py).
