Update README.md
README.md
CHANGED
@@ -7,6 +7,11 @@ tags:
 - medieval
 - ocr
 - htr
+language:
+- de
+- fr
+- la
+- nl
 ---
 # TrOCR Medieval Model with linemasks generated in eScriptorium (https://de.wikipedia.org/wiki/EScriptorium)
 Base model: **microsoft/trocr-base-handwritten**
@@ -14,5 +19,21 @@ Base model: **microsoft/trocr-base-handwritten**
 Epochs: 19.05 / 20
 Eval CER: 0.0329
 
+This model is trained on combined ground truth for different **charter** and **book scripts** from a variety of projects and institutions, with the aim of building a generic model for Latin scripts of the Middle Ages.
+It is mainly based on documents from the projects CREMMA Manuscrits médiévaux latins, HIMANIS (CNRS), Itinera Nova (Stadsarchief Leuven), and Charters and Records of Königsfelden (Universität Zürich).
+
+Based on the following data:
+CREMMA Manuscrits médiévaux latins was produced by Clérice, Thibault; Chagué, Alix; Vlachou Efstathiou, Malamatenia. Licensed under a CC-BY 4.0 license.
+URL: https://github.com/HTR-United/CREMMA-Medieval-LAT
+
+HIMANIS is partially published as HIMANIS Guérin, produced by Stutzmann, Dominique; Hamel, Sébastien; Kernier, Iseut de; Mühlberger, Günter; Hackl, Günter. Licensed under a CC-BY 4.0 license.
+DOI: 10.5281/zenodo.5535306
+
+Charters and Records of Königsfelden Abbey and Bailiwick (1308-1662) was produced by Halter-Pernet, Colette; Teuscher, Simon; Hodel, Tobias; Barwitzki, Lukas; Egloff, Salome; Henggeler, Fabian; Nadig, Michael; Steinmann, Anina; Stettler, Sabine; Prada Ziegler, Ismail. Licensed under a CC-BY 4.0 license.
+DOI: 10.5281/zenodo.5179361
+
+The model is based on the same data as the following PyLaia model (available on Transkribus):
+https://readcoop.eu/model/charter-scripts-german-latin-french/
+
 The model has not been extensively tested.
 Potential biases are still to be identified.
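
The README reports an Eval CER of 0.0329 but does not say how it was computed. A common definition of character error rate, assumed here, is the Levenshtein edit distance between the predicted and reference transcriptions divided by the reference length. The sketch below illustrates that definition only; the function names `levenshtein` and `cer` are illustrative and not taken from the model's training code.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def cer(prediction: str, reference: str) -> float:
    """Character error rate under the assumed definition:
    edit distance divided by the number of reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(prediction, reference) / len(reference)
```

Under this assumed definition, a CER of 0.0329 corresponds to roughly 3.3 character edits per 100 reference characters.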