Commit 59862f1
Parent(s): 17491f8
Upload folder using huggingface_hub (#1)
- Upload folder using huggingface_hub (bc7807d250027f9eef27cf2e379d3088b5b0b929)
- Update README.md (83d224f67ead461218735dc12bd22ebfc16d69b5)
- Upload 3 files (1d7c3aab41c5a7c7707204e32b16ee3d75d26aee)
- Update README.md (23b7cda9f41de217e709b4e6a9d65a30629f02df)
- Update README.md (8bdbebc80e4e05b5682e99a076e63c66cc5b17d1)
Co-authored-by: Yoann Schneider <yschneider@users.noreply.huggingface.co>
README.md
CHANGED
@@ -1,3 +1,73 @@
----
-
-
+---
+library_name: PyLaia
+license: mit
+tags:
+- PyLaia
+- PyTorch
+- atr
+- htr
+- ocr
+- historical
+- handwritten
+metrics:
+- CER
+- WER
+language:
+- fr
+base_model: Teklia/pylaia-norhand-v3
+datasets:
+- Teklia/PELLET-Casimir-Marius-line
+pipeline_tag: image-to-text
+---
+
+# PyLaia - PELLET Casimir Marius - Line level
+
+This model performs Handwritten Text Recognition in French. Trained following [Teklia's tutorial](https://doc.arkindex.org/tutorial/).
+
+## Model description
+
+The model has been trained, for 100 epochs, using the PyLaia library on the [PELLET Casimir Marius - Line level](Teklia/PELLET-Casimir-Marius-line) dataset.
+
+Training images were resized with a fixed height of 128 pixels, keeping the original aspect ratio.
+
+| set | lines |
+| :---- | ----: |
+| train | 842 |
+| val | 125 |
+| test | 122 |
+
+## Evaluation results
+
+The model achieves the following results:
+
+| set | CER (%) | WER (%) | text_line |
+| :---- | ------: | ------: | --------: |
+| train | 24.17 | 58.12 | 842 |
+| val | 22.9 | 58.75 | 125 |
+| test | 18.78 | 50 | 122 |
+
+## How to use?
+
+Please refer to the [PyLaia documentation](https://atr.pages.teklia.com/pylaia/usage/prediction/) to use this model.
+
+## Cite us!
+
+
+```bibtex
+@inproceedings{pylaia2024,
+author = {Tarride, Solène and Schneider, Yoann and Generali-Lince, Marie and Boillet, Mélodie and Abadie, Bastien and Kermorvant, Christopher},
+title = {{Improving Automatic Text Recognition with Language Models in the PyLaia Open-Source Library}},
+booktitle = {Document Analysis and Recognition - ICDAR 2024},
+year = {2024},
+publisher = {Springer Nature Switzerland},
+address = {Cham},
+pages = {387--404},
+isbn = {978-3-031-70549-6}
+}
+```
+
+<style>
+table {
+width: 50%;
+}
+</style>
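The README added in this hunk says training images were resized to a fixed height of 128 pixels while keeping the original aspect ratio. The sketch below only computes the target dimensions for such a resize. The function name `fixed_height_size` and the rounding policy are assumptions made for illustration; PyLaia's own preprocessing may round or interpolate differently.

```python
from typing import Tuple


def fixed_height_size(width: int, height: int, target_height: int = 128) -> Tuple[int, int]:
    """Return (new_width, new_height) for an image scaled to target_height.

    The aspect ratio is preserved by applying the same scale factor to the
    width. Rounding to the nearest pixel is an assumption of this sketch.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    scale = target_height / height
    # Never let a very narrow image collapse to a width of zero.
    new_width = max(1, round(width * scale))
    return new_width, target_height


if __name__ == "__main__":
    # A 1600x200 line image is scaled by 128/200 = 0.64.
    print(fixed_height_size(1600, 200))
```

The resulting pair can be passed to the resize call of whichever image library is in use.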
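The evaluation table in the README reports CER and WER as percentages. As a rough illustration of what these metrics measure, the sketch below computes both from a Levenshtein edit distance over characters and over whitespace-separated words. Aggregating over the whole set (total errors divided by total reference length) is an assumption here, and this is not necessarily how the figures in the table were produced.

```python
from typing import Callable, List, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (characters or words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs: List[str], hyps: List[str],
               tokenize: Callable[[str], Sequence]) -> float:
    """Corpus-level error rate in percent: total edits / total reference tokens."""
    errors = total = 0
    for ref, hyp in zip(refs, hyps):
        ref_tokens, hyp_tokens = tokenize(ref), tokenize(hyp)
        errors += edit_distance(ref_tokens, hyp_tokens)
        total += len(ref_tokens)
    if total == 0:
        raise ValueError("references contain no tokens")
    return 100.0 * errors / total


def cer(refs: List[str], hyps: List[str]) -> float:
    return error_rate(refs, hyps, list)


def wer(refs: List[str], hyps: List[str]) -> float:
    return error_rate(refs, hyps, str.split)


if __name__ == "__main__":
    references = ["le chat noir"]   # made-up example line
    hypotheses = ["le chat noit"]   # one wrong character in the last word
    print(f"CER = {cer(references, hypotheses):.2f}%")
    print(f"WER = {wer(references, hypotheses):.2f}%")
```

A single wrong character counts once for CER but invalidates a whole word for WER, which is why WER is usually much higher than CER on the same predictions.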
model
ADDED
Binary file (1.13 kB)
syms.txt
ADDED
@@ -0,0 +1,91 @@
+<ctc> 0
+! 1
+" 2
+' 3
+( 4
+) 5
+, 6
+- 7
+. 8
+/ 9
+0 10
+1 11
+2 12
+3 13
+4 14
+5 15
+6 16
+7 17
+8 18
+9 19
+: 20
+; 21
+? 22
+A 23
+B 24
+C 25
+D 26
+E 27
+F 28
+G 29
+H 30
+I 31
+J 32
+K 33
+L 34
+M 35
+N 36
+O 37
+P 38
+Q 39
+R 40
+S 41
+T 42
+U 43
+V 44
+W 45
+X 46
+Y 47
+Z 48
+a 49
+b 50
+c 51
+d 52
+e 53
+f 54
+g 55
+h 56
+i 57
+j 58
+k 59
+l 60
+m 61
+n 62
+o 63
+p 64
+q 65
+r 66
+s 67
+t 68
+u 69
+v 70
+w 71
+x 72
+y 73
+z 74
+° 75
+À 76
+à 77
+â 78
+ç 79
+è 80
+é 81
+ê 82
+ë 83
+ô 84
+ù 85
+û 86
+œ 87
+’ 88
+<unk> 89
+<space> 90
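The `syms.txt` file added above maps each symbol to an integer index, with `<ctc>` at index 0 and `<space>` at index 90. The sketch below shows one way such a table could be read and used to turn per-frame class indices into text. It assumes `<ctc>` is the CTC blank and uses simple greedy decoding (collapse repeats, then drop blanks). The helper names are hypothetical, and PyLaia's own decoder may work differently, for instance when a language model is involved.

```python
from itertools import groupby
from typing import Dict, Iterable, List


def load_symbols(lines: Iterable[str]) -> Dict[int, str]:
    """Parse 'symbol index' lines into an index -> symbol mapping."""
    table: Dict[int, str] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the last whitespace run: the index is always the final field.
        symbol, index = line.rsplit(maxsplit=1)
        table[int(index)] = symbol
    return table


def greedy_ctc_decode(frame_ids: List[int], table: Dict[int, str],
                      blank: str = "<ctc>", space: str = "<space>") -> str:
    """Collapse repeated indices, drop the blank, and render spaces."""
    collapsed = [idx for idx, _ in groupby(frame_ids)]
    symbols = [table[idx] for idx in collapsed if table[idx] != blank]
    return "".join(" " if sym == space else sym for sym in symbols)


if __name__ == "__main__":
    # Excerpt of the table from the hunk above.
    excerpt = ["<ctc> 0", "a 49", "l 60", "à 77", "<space> 90"]
    table = load_symbols(excerpt)
    # Made-up per-frame predictions for illustration.
    frames = [77, 77, 0, 90, 60, 60, 0, 49, 49]
    print(greedy_ctc_decode(frames, table))
```

The blank between two identical indices is what lets CTC emit a doubled letter, so the collapse step has to run before blanks are removed.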
weights.ckpt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c0c8bc3d52d8c524b70bec2aa83c3817a4b6c5b458c904c8ba24bf592af19eb8
+size 21545288
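The three lines added for `weights.ckpt` are a Git LFS pointer rather than the checkpoint itself: they record the spec version, a SHA-256 object id, and the size in bytes. The sketch below parses such a pointer and checks a byte string against it. The function names are hypothetical, and a real 21 MB checkpoint would more sensibly be hashed in chunks from disk than loaded whole.

```python
import hashlib
from typing import Dict, Union


def parse_lfs_pointer(text: str) -> Dict[str, Union[str, int]]:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: Dict[str, Union[str, int]]) -> bool:
    """Return True if data has the size and SHA-256 digest named by the pointer."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']}")
    return (len(data) == pointer["size"]
            and hashlib.sha256(data).hexdigest() == pointer["digest"])


if __name__ == "__main__":
    pointer_text = (
        "version https://git-lfs.github.com/spec/v1\n"
        "oid sha256:c0c8bc3d52d8c524b70bec2aa83c3817a4b6c5b458c904c8ba24bf592af19eb8\n"
        "size 21545288\n"
    )
    pointer = parse_lfs_pointer(pointer_text)
    print(pointer["algo"], pointer["size"])
```

Comparing a downloaded file against the pointer is a quick way to confirm that the actual weights, and not the small pointer file, were fetched.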