TrOCR Medieval Model with linemasks generated in eScriptorium (https://de.wikipedia.org/wiki/EScriptorium)

Base model: microsoft/trocr-base-handwritten

Epochs: 19.05 / 20
Eval CER: 0.0329
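
For reference, a character error rate (CER) of 0.0329 corresponds to roughly 3.3 character errors per 100 reference characters. The card does not say which tool produced this value, so the following is only a minimal sketch of the usual definition (Levenshtein edit distance divided by reference length). The function names `levenshtein` and `cer` and the sample strings are illustrative assumptions, not part of the model's training code.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of character insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance normalised by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical example: one missing character in a 21-character reference line.
reference = "in nomine domini amen"
hypothesis = "in nomine domni amen"
print(round(cer(reference, hypothesis), 4))
```

Evaluation frameworks may normalise whitespace or aggregate over the whole test set before dividing, so values computed this way are not guaranteed to match the reported figure exactly.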

This model is trained on combined ground truth covering different charter and book scripts from a variety of projects and institutions, with the aim of building a generic model for Latin scripts of the Middle Ages. It is mainly based on documents from the projects CREMMA Manuscrits médiévaux latins, HIMANIS (CNRS), Itinera Nova (Stadsarchief Leuven), and Charters and Records of Königsfelden (Universität Zürich).
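
Because the base model is a TrOCR checkpoint, the model can presumably be run with the standard TrOCR classes from the `transformers` library on single text-line images. The sketch below rests on several assumptions: the repository id of this model is not stated in the card, so it has to be passed in as `model_id`; the processor is assumed to be stored alongside the model (otherwise `microsoft/trocr-base-handwritten` could be passed as `processor_id`); and the helper name `transcribe_line` and the `max_new_tokens` value are purely illustrative.

```python
from typing import Optional


def transcribe_line(image_path: str, model_id: str,
                    processor_id: Optional[str] = None,
                    max_new_tokens: int = 128) -> str:
    """Transcribe one cropped text-line image with a TrOCR-style model.

    model_id is the Hugging Face repository id of the fine-tuned model
    (not given in the card, so the caller must supply it).
    """
    # Heavy dependencies are imported lazily so that the function can be
    # defined even where transformers and Pillow are not installed.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    # Assumption: the processor lives in the same repository as the model.
    processor = TrOCRProcessor.from_pretrained(processor_id or model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)

    # TrOCR expects RGB input; the processor resizes and normalises the image.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    generated_ids = model.generate(pixel_values, max_new_tokens=max_new_tokens)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]


# Usage (hypothetical file name and repository id):
# text = transcribe_line("line_0001.png", "<namespace>/<model-name>")
```

The model was trained with line masks generated in eScriptorium, so line images segmented in a comparable way are likely to give results closest to the reported evaluation; this has not been verified here.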

The model is based on the following data:

CREMMA Manuscrits médiévaux latins has been produced by Clérice, Thibault; Chagué, Alix; Vlachou Efstathiou, Malamatenia. Licensed under a CC-BY 4.0 license. URL: https://github.com/HTR-United/CREMMA-Medieval-LAT

HIMANIS is partially published as HIMANIS Guérin, which has been produced by Stutzmann, Dominique; Hamel, Sébastien; Kernier, Iseut de; Mühlberger, Günter; Hackl, Günter. Licensed under a CC-BY 4.0 license. DOI: 10.5281/zenodo.5535306

Charters and Records of Königsfelden Abbey and Bailiwick (1308-1662) has been produced by Halter-Pernet, Colette; Teuscher, Simon; Hodel, Tobias; Barwitzki, Lukas; Egloff, Salome; Henggeler, Fabian; Nadig, Michael; Steinmann, Anina; Stettler, Sabine; Prada Ziegler, Ismail. Licensed under a CC-BY 4.0 license. DOI: 10.5281/zenodo.5179361

The model is based on the same data as the following PyLaia model (available on Transkribus): https://readcoop.eu/model/charter-scripts-german-latin-french/

The model has not been extensively tested. Potential biases are still to be identified.
