metadata
license: apache-2.0
datasets:
- nastyboget/stackmix_hkr_large
- nastyboget/stackmix_cyrillic_large
- nastyboget/synthetic_cyrillic_large
language:
- ru
- en
pipeline_tag: image-to-text
tags:
- ocr
Model Card for TrOCR-Ru
This model is microsoft/trocr-base-handwritten fine-tuned on the large synthetic datasets from nastyboget: stackmix_hkr_large, stackmix_cyrillic_large, and synthetic_cyrillic_large.
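A minimal inference sketch, assuming the fine-tuned checkpoint loads with the same `transformers` classes as its base model (`TrOCRProcessor` and `VisionEncoderDecoderModel`). This card does not state the repository id of the fine-tuned weights, so the default below is the base checkpoint named above; pass this model's actual repository id to get Russian output. The function name `recognize_line` and its parameters are hypothetical, chosen for illustration only.

```python
def recognize_line(image_path, checkpoint="microsoft/trocr-base-handwritten"):
    """Return the text recognized in one cropped text-line image.

    `checkpoint` defaults to the base model; replace it with the
    repository id of this fine-tuned model (not stated in this card).
    """
    # Imported here so that defining the function does not require the
    # heavy dependencies (Pillow, transformers, torch) to be installed.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)

    # TrOCR expects a three-channel image containing a single line of text.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # Autoregressive decoding, then conversion of token ids back to a string.
    generated_ids = model.generate(pixel_values)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

Calling `recognize_line("line.png", checkpoint=...)` with a path to a cropped line image returns the decoded string. Loading the processor and model on every call is wasteful for batch use; in practice they would be created once and reused.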
Metrics on HKR/Cyrillic datasets
| Metric | HKR_test1 | HKR_test2 | CYR_test |
|---|---|---|---|
| Accuracy | 60.71 | 62.48 | 58.29 |
| CER | 10.47 | 8.33 | 10.30 |
| WER | 34.10 | 27.86 | 38.53 |
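The card does not say how these metrics were computed. The sketch below shows one common set of definitions, which is an assumption rather than the author's evaluation script: accuracy as the share of lines predicted exactly, CER as character-level edit distance divided by total reference characters, and WER as the same ratio over whitespace-separated words, all reported as percentages. The helper names `levenshtein` and `evaluate` are hypothetical.

```python
def levenshtein(ref, hyp):
    """Edit distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def evaluate(references, predictions):
    """Return accuracy, CER and WER in percent (assumed definitions)."""
    char_errors = char_total = word_errors = word_total = exact = 0
    for ref, hyp in zip(references, predictions):
        char_errors += levenshtein(ref, hyp)
        char_total += len(ref)
        ref_words, hyp_words = ref.split(), hyp.split()
        word_errors += levenshtein(ref_words, hyp_words)
        word_total += len(ref_words)
        exact += int(ref == hyp)
    return {
        "accuracy": 100.0 * exact / len(references),
        "cer": 100.0 * char_errors / char_total,
        "wer": 100.0 * word_errors / word_total,
    }
```

Under these definitions a single wrong character makes the whole line count as an accuracy miss and the whole word count as a WER error, which would explain why line accuracy sits far below what the CER alone might suggest.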