gigapdf-ocr-handwriting: Latin/Cyrillic/Greek handwriting OCR (CRNN + CTC)

A handwriting text-line recognizer (HTR) for the gigapdf-lib OCR engine. PaddleOCR handles printed text only, so this model adds handwriting recognition for the Latin, Cyrillic, and Greek alphabets. It runs on RTen (pure-Rust ONNX, no C++, no Tesseract) and is opt-in: a handwriting model is overconfident on printed input, so it is kept out of the engine's automatic printed-script selection and must be called explicitly.

Architecture

  • CRNN + CTC, standard ops only: conv backbone (W/4 downsample) → height collapse → 2-layer bidirectional nn.LSTM (256 hidden) → CTC. The standard LSTM exports to a dynamic-width ONNX LSTM op, so the engine feeds each line at its natural width, with no fixed-width padding.
  • Input: grayscale, height 32, ink = 1 − gray (dark text → 1) on a 0 background, tight-cropped to the ink, tensor [1, 1, 32, W] (dynamic width).
  • Output: [1, T, K+1] CTC logits. Charlist: classes 0..K-1 = the dict.txt alphabet (Latin-extended + Cyrillic + Greek, one char per line), class K = CTC blank (last).
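
The input convention above (ink = 1 − gray, tight crop to the ink) can be sketched in plain Rust as follows. This is a minimal illustration, not the engine's actual preprocessing: the function names, the 8-bit input scaled by 255, and the `threshold` used to decide what counts as ink are all assumptions. Resizing the crop to height 32 is left out.

```rust
/// Convert 8-bit grayscale pixels to ink values: ink = 1 - gray/255,
/// so dark text maps to ~1 and the paper background to ~0.
/// (Assumption: input is u8; the model card only states ink = 1 - gray.)
fn to_ink(gray: &[u8]) -> Vec<f32> {
    gray.iter().map(|&g| 1.0 - f32::from(g) / 255.0).collect()
}

/// Bounding box (x0, y0, x1, y1) of all pixels whose ink exceeds `threshold`.
/// x1 and y1 are exclusive. Returns None when the image holds no ink.
/// (`threshold` is a hypothetical parameter, not taken from the model card.)
fn ink_bbox(
    ink: &[f32],
    width: usize,
    height: usize,
    threshold: f32,
) -> Option<(usize, usize, usize, usize)> {
    let (mut x0, mut y0, mut x1, mut y1) = (width, height, 0usize, 0usize);
    for y in 0..height {
        for x in 0..width {
            if ink[y * width + x] > threshold {
                x0 = x0.min(x);
                y0 = y0.min(y);
                x1 = x1.max(x + 1);
                y1 = y1.max(y + 1);
            }
        }
    }
    if x1 > x0 && y1 > y0 {
        Some((x0, y0, x1, y1))
    } else {
        None
    }
}

/// Crop a row-major image to `bbox`; returns (pixels, new_width, new_height).
fn tight_crop(
    ink: &[f32],
    width: usize,
    bbox: (usize, usize, usize, usize),
) -> (Vec<f32>, usize, usize) {
    let (x0, y0, x1, y1) = bbox;
    let mut out = Vec::with_capacity((x1 - x0) * (y1 - y0));
    for y in y0..y1 {
        out.extend_from_slice(&ink[y * width + x0..y * width + x1]);
    }
    (out, x1 - x0, y1 - y0)
}
```

After cropping, the line would still need to be scaled to height 32 and laid out as a [1, 1, 32, W] tensor before inference; whether the engine crops before or after scaling is not stated here.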

Training data

Trained on real handwriting line corpora (~108k lines) via the Hugging Face datasets-server mirrors: IAM, RIMES, NorHand, NewsEye, Belfort, POPP, Esposalles (Latin) and a synthetic Cyrillic handwriting set, plus synthetic lines rendered from text corpora × handwriting/print fonts for breadth on glyphs the real corpora under-cover. Trainer: crates/ocr-rten/tools/train_handwriting.py. The lineage traces to gigapdf's first handwriting model, which beat Tesseract on IAM (CER 0.309); this is the clean, dynamic-width RTen re-train.
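
For readers unfamiliar with the CER figure quoted above: character error rate is conventionally the Levenshtein edit distance between the recognized text and the reference, divided by the reference length in characters. The sketch below shows that standard definition; it is not taken from the trainer script, and the function names and the handling of an empty reference are assumptions.

```rust
/// Levenshtein edit distance between two character sequences
/// (unit cost for insertion, deletion, and substitution).
fn edit_distance(a: &[char], b: &[char]) -> usize {
    // prev[j] = distance between the first i chars of a and first j chars of b
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, &ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, &cb) in b.iter().enumerate() {
            let substitute = prev[j] + usize::from(ca != cb);
            let delete = prev[j + 1] + 1;
            let insert = cur[j] + 1;
            cur.push(substitute.min(delete).min(insert));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Character error rate: edit distance / reference length.
/// Counts Unicode scalar values, so Cyrillic and Greek letters count as one each.
/// (Assumption: an empty reference is treated as length 1 to avoid dividing by zero.)
fn cer(reference: &str, hypothesis: &str) -> f64 {
    let r: Vec<char> = reference.chars().collect();
    let h: Vec<char> = hypothesis.chars().collect();
    edit_distance(&r, &h) as f64 / r.len().max(1) as f64
}
```

A corpus-level CER is usually the summed edit distance over all lines divided by the summed reference length, rather than the mean of per-line values.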

Files

File        Use
model.onnx  ONNX graph (dynamic batch/width), opset 17
model.rten  Converted for the RTen runtime (rten-convert)
dict.txt    alphabet, one char per line (blank is the implicit last class)
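
To illustrate how dict.txt lines up with the [1, T, K+1] output, here is a minimal greedy CTC decoder: class k is line k of dict.txt, and class K (one past the last line) is the blank. This is a sketch under stated assumptions, not the engine's decoder: the function names are hypothetical, the logits are assumed to be flattened row-major as T rows of K+1 values, and empty lines in dict.txt are skipped.

```rust
/// Parse dict.txt contents: one character per line, class index = line index.
/// (Assumption: empty lines are skipped; a real dict may need stricter handling.)
fn load_charlist(dict_txt: &str) -> Vec<char> {
    dict_txt.lines().filter_map(|line| line.chars().next()).collect()
}

/// Greedy (best-path) CTC decoding: take the argmax class at every time step,
/// collapse consecutive repeats, then drop blanks.
/// `logits` holds `steps` rows of K+1 values; the blank is class K (last).
fn ctc_greedy_decode(logits: &[f32], steps: usize, charlist: &[char]) -> String {
    let blank = charlist.len(); // K
    let classes = blank + 1; // K + 1
    assert_eq!(logits.len(), steps * classes, "logits shape mismatch");

    let mut text = String::new();
    let mut prev = blank;
    for t in 0..steps {
        let row = &logits[t * classes..(t + 1) * classes];
        // Argmax over the K+1 classes at this time step.
        let mut best = 0;
        for (k, &value) in row.iter().enumerate() {
            if value > row[best] {
                best = k;
            }
        }
        // Emit only when the class changes and is not the blank.
        if best != blank && best != prev {
            text.push(charlist[best]);
        }
        prev = best;
    }
    text
}
```

Because the blank resets `prev`, a doubled letter such as "aa" survives only when the model emits a blank between the two occurrences, which is the usual CTC behavior.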

Usage (RTen, via gigapdf-ocr-rten)

use gigapdf_ocr_rten::OcrEngine;
// Drop model.rten + dict.txt into <models_dir>/latin_hw/, then call it EXPLICITLY:
let eng = OcrEngine::load_models_dir("models")?;
let lines = eng.recognize_page_handwriting(&rgb_image)?;          // opt-in
// or: eng.recognize_page_with(&rgb_image, gigapdf_ocr_rten::HANDWRITING_MODEL)?;

License

PolyForm Noncommercial 1.0.0. Copyright 2025 Rony Licha / QR Communication. Commercial use requires a separate license.
