gigapdf-ocr-handwriting: Latin/Cyrillic/Greek handwriting OCR (CRNN + CTC)
A handwriting text-line recognizer (HTR) for the gigapdf-lib
OCR engine. PaddleOCR is printed-text only, so this model adds handwriting recognition for the
Latin, Cyrillic, and Greek alphabets. It runs on RTen (a pure-Rust
ONNX runtime: no C++, no Tesseract) and is opt-in: a handwriting model is overconfident on printed
input, so it is kept out of the engine's automatic printed-script selection.
Architecture
- CRNN + CTC, standard ops only: conv backbone (W/4 downsample) → height collapse → 2-layer
bidirectional nn.LSTM (256 hidden) → CTC. The standard LSTM exports to a dynamic-width ONNX LSTM op, so the engine feeds each line at its natural width, with no fixed-width padding.
- Input: grayscale, height 32, ink = 1 − gray (dark text → 1) on a 0 background, tight-cropped to the ink, tensor [1, 1, 32, W] (dynamic width).
- Output: [1, T, K+1] CTC logits. Charlist: classes 0..K-1 = the dict.txt alphabet (Latin-extended + Cyrillic + Greek, one char per line), class K = CTC blank (last).
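The input convention above (ink = 1 − gray, tight crop, height 32) can be sketched in plain Rust as follows. This is a minimal illustration, not the engine's actual preprocessing: the function names, the ink threshold used to find the bounding box, and the nearest-neighbour resize are all assumptions made for the example.

```rust
/// Hypothetical helper: turn an 8-bit grayscale image (row-major, 255 = white paper)
/// into ink values via ink = 1 - gray/255, then crop to the bounding box of all
/// pixels whose ink exceeds `threshold` (the threshold is an assumption).
/// Returns (cropped ink pixels, cropped width, cropped height), or None if blank.
fn ink_tight_crop(
    gray: &[u8],
    width: usize,
    height: usize,
    threshold: f32,
) -> Option<(Vec<f32>, usize, usize)> {
    assert_eq!(gray.len(), width * height);
    let ink: Vec<f32> = gray.iter().map(|&g| 1.0 - g as f32 / 255.0).collect();

    // Track the ink bounding box.
    let (mut x0, mut y0, mut x1, mut y1) = (width, height, 0usize, 0usize);
    let mut found = false;
    for y in 0..height {
        for x in 0..width {
            if ink[y * width + x] > threshold {
                found = true;
                x0 = x0.min(x);
                x1 = x1.max(x);
                y0 = y0.min(y);
                y1 = y1.max(y);
            }
        }
    }
    if !found {
        return None;
    }

    let (cw, ch) = (x1 - x0 + 1, y1 - y0 + 1);
    let mut out = Vec::with_capacity(cw * ch);
    for y in y0..=y1 {
        out.extend_from_slice(&ink[y * width + x0..y * width + x1 + 1]);
    }
    Some((out, cw, ch))
}

/// Hypothetical helper: nearest-neighbour resize to a fixed height, keeping the
/// aspect ratio so the width stays dynamic. Returns (pixels, new width); the
/// result is the data for a [1, 1, target_h, new width] tensor.
fn resize_to_height(src: &[f32], sw: usize, sh: usize, target_h: usize) -> (Vec<f32>, usize) {
    assert_eq!(src.len(), sw * sh);
    let new_w = ((sw as f32 * target_h as f32 / sh as f32).round() as usize).max(1);
    let mut out = Vec::with_capacity(new_w * target_h);
    for y in 0..target_h {
        let sy = (y * sh / target_h).min(sh - 1);
        for x in 0..new_w {
            let sx = (x * sw / new_w).min(sw - 1);
            out.push(src[sy * sw + sx]);
        }
    }
    (out, new_w)
}
```

Calling `ink_tight_crop` and then `resize_to_height(.., 32)` yields a buffer of 32 × W values. With the W/4 downsample in the backbone, the number of CTC time steps T is then roughly W/4; the exact value depends on padding details not stated here.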
Training data
Trained on real handwriting line corpora (~108k lines) via the Hugging Face datasets-server
mirrors: IAM, RIMES, NorHand, NewsEye, Belfort, POPP, Esposalles (Latin) + a synthetic Cyrillic
handwriting set, plus synthetic lines rendered from text corpora × handwriting/print fonts
for breadth on glyphs the real corpora under-cover. Trainer:
crates/ocr-rten/tools/train_handwriting.py. The lineage traces to gigapdf's first handwriting model,
which beat Tesseract on IAM (CER 0.309); this is the clean, dynamic-width RTen re-train.
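For readers unfamiliar with the metric, CER (character error rate) is conventionally the character-level edit distance between hypothesis and reference, divided by the reference length. The sketch below assumes that conventional definition, aggregated over a set of lines; it is not taken from the trainer script, and `edit_distance` and `cer` are illustrative names.

```rust
/// Levenshtein distance over Unicode scalar values (insert, delete, substitute = 1 each).
fn edit_distance(a: &[char], b: &[char]) -> usize {
    // prev[j] = distance between the first i chars of `a` and the first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, &ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1; b.len() + 1];
        for (j, &cb) in b.iter().enumerate() {
            let substitute = prev[j] + usize::from(ca != cb);
            let delete = prev[j + 1] + 1;
            let insert = cur[j] + 1;
            cur[j + 1] = substitute.min(delete).min(insert);
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Corpus-level CER over (reference, hypothesis) pairs:
/// total edit distance divided by total reference characters.
/// Returns 0.0 when there are no reference characters (an arbitrary choice).
fn cer(pairs: &[(&str, &str)]) -> f64 {
    let mut errors = 0usize;
    let mut ref_chars = 0usize;
    for (reference, hypothesis) in pairs {
        let r: Vec<char> = reference.chars().collect();
        let h: Vec<char> = hypothesis.chars().collect();
        errors += edit_distance(&r, &h);
        ref_chars += r.len();
    }
    if ref_chars == 0 {
        0.0
    } else {
        errors as f64 / ref_chars as f64
    }
}
```

Under this definition a CER of 0.309 would mean roughly 31 character edits per 100 reference characters.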
Files
| File | Use |
|---|---|
| model.onnx | ONNX graph (dynamic batch/width), opset 17 |
| model.rten | Converted for the RTen runtime (rten-convert) |
| dict.txt | Alphabet, one char per line (blank is the implicit last class) |
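To illustrate how dict.txt and the implicit blank class fit together, here is a minimal sketch of parsing the charlist and running greedy (best-path) CTC decoding over the model's logits. It assumes each dict.txt line holds exactly one Unicode scalar value and that the logits for one line are flattened row-major as T × (K+1). The function names are hypothetical, and the engine may well decode differently (for example with beam search).

```rust
/// Parse dict.txt contents: one character per line, in class order 0..K-1.
/// Empty lines are skipped (an assumption about the file format).
fn parse_charlist(dict: &str) -> Vec<char> {
    dict.lines().filter_map(|line| line.chars().next()).collect()
}

/// Greedy CTC decode: take the argmax class at each time step, collapse
/// consecutive repeats, and drop the blank, which is class K = charlist.len().
fn ctc_greedy_decode(logits: &[f32], t: usize, charlist: &[char]) -> String {
    let blank = charlist.len();
    let classes = charlist.len() + 1;
    assert_eq!(logits.len(), t * classes);

    let mut out = String::new();
    let mut prev = blank;
    for step in logits.chunks_exact(classes) {
        // Argmax over the K+1 classes for this time step.
        let mut best = 0;
        for (k, &v) in step.iter().enumerate() {
            if v > step[best] {
                best = k;
            }
        }
        // Emit only when the class changes and is not blank.
        if best != blank && best != prev {
            out.push(charlist[best]);
        }
        prev = best;
    }
    out
}
```

A blank between two identical classes is what lets CTC emit doubled letters: the per-step sequence a, a, blank, a, b, b decodes to "aab".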
Usage (RTen, via gigapdf-ocr-rten)
use gigapdf_ocr_rten::OcrEngine;
// Drop model.rten + dict.txt into <models_dir>/latin_hw/, then call it EXPLICITLY:
let eng = OcrEngine::load_models_dir("models")?;
let lines = eng.recognize_page_handwriting(&rgb_image)?; // opt-in
// or: eng.recognize_page_with(&rgb_image, gigapdf_ocr_rten::HANDWRITING_MODEL)?;
License
PolyForm Noncommercial 1.0.0. Copyright 2025 Rony Licha / QR Communication. Commercial use requires a separate license.