---
library_name: PyLaia
license: mit
tags:
- PyLaia
- PyTorch
- atr
- htr
- ocr
- modern
- handwritten
metrics:
- CER
- WER
language:
- en
datasets:
- Teklia/IAM
pipeline_tag: image-to-text
---

# PyLaia - IAM

This model performs Handwritten Text Recognition in English on modern documents.

## Model description

The model was trained using the PyLaia library on the RWTH split of the [IAM](https://fki.tic.heia-fr.ch/databases/iam-handwriting-database) dataset.

Training images were resized to a fixed height of 128 pixels, preserving the original aspect ratio.
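
The resizing step can be sketched as follows. This is a minimal illustration, not PyLaia's actual preprocessing code: the function names `resized_size` and `resize_line_image` are hypothetical, and the rounding rule and Pillow's default interpolation are assumptions.

```python
def resized_size(width: int, height: int, target_height: int = 128) -> tuple[int, int]:
    """Return (new_width, new_height) for a fixed height, preserving aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = target_height / height
    # Rounding to the nearest pixel is an assumption of this sketch.
    new_width = max(1, round(width * scale))
    return new_width, target_height


def resize_line_image(path: str, target_height: int = 128):
    """Open a line image with Pillow and resize it to the target height."""
    from PIL import Image  # imported lazily; requires the Pillow package

    with Image.open(path) as image:
        size = resized_size(image.width, image.height, target_height)
        # Uses Pillow's default resampling filter (an assumption, not the model's setting).
        return image.resize(size)
```

For example, a 1000x64 line image would be scaled to 2000x128, and a 1500x256 image to 750x128.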

| set   | lines |
| :---- | ----: |
| train | 6,482 |
| val   |   976 |
| test  | 2,915 |

An external 6-gram character language model can be used to improve recognition. It is trained on the text of the IAM training set.
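
To make the idea of a character 6-gram model concrete, the sketch below counts character n-grams in a list of text lines and derives unsmoothed next-character frequencies. It is illustrative only: the helper names are hypothetical, there is no smoothing or back-off, and it is not the tooling used to build the language model distributed with this model.

```python
from collections import Counter


def char_ngram_counts(lines, order=6):
    """Count every character n-gram of the given order in the text lines."""
    counts = Counter()
    for line in lines:
        for i in range(len(line) - order + 1):
            counts[line[i:i + order]] += 1
    return counts


def next_char_distribution(counts, context):
    """Relative frequency of each character following a context of order-1 characters."""
    followers = {
        ngram[-1]: n for ngram, n in counts.items() if ngram[:-1] == context
    }
    total = sum(followers.values())
    # No smoothing: unseen contexts yield an empty distribution.
    return {ch: n / total for ch, n in followers.items()} if total else {}
```

During decoding, such probabilities can be combined with the optical model's scores so that character sequences that are frequent in the training text are preferred.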

## Evaluation results

The model achieves the following results on the test set, with and without the language model:

| set   | Language model | CER (%)    | WER (%) | lines   |
|:------|:---------------| ----------:| -------:|----------:|
| test  | no             | 8.44       | 24.51   |     2,915 |
| test  | yes            | 7.50       | 20.98   |     2,915 |
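
For reference, CER and WER are edit-distance-based error rates computed over characters and words respectively. The sketch below shows one common way to compute them; the function names are hypothetical, and the corpus-level aggregation and whitespace tokenization are assumptions that may differ from the evaluation used to produce the table above.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insertions, deletions, substitutions)."""
    previous = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        current = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def error_rate(references, hypotheses, unit="char"):
    """Return the error rate in percent: total edits / total reference units."""
    errors = total = 0
    for ref, hyp in zip(references, hypotheses):
        ref_units = list(ref) if unit == "char" else ref.split()
        hyp_units = list(hyp) if unit == "char" else hyp.split()
        errors += edit_distance(ref_units, hyp_units)
        total += len(ref_units)
    return 100.0 * errors / total
```

With the reference `"hello world"` and the prediction `"hallo world"`, one of 11 characters and one of 2 words is wrong, so `error_rate` gives roughly 9.09 for `unit="char"` and 50.0 for `unit="word"`.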

## How to use?

Please refer to the [PyLaia documentation](https://atr.pages.teklia.com/pylaia/usage/prediction/) to use this model.

## Cite us!

```bibtex
@inproceedings{pylaia2024,
    author = {Tarride, Solène and Schneider, Yoann and Generali-Lince, Marie and Boillet, Mélodie and Abadie, Bastien and Kermorvant, Christopher},
    title = {{Improving Automatic Text Recognition with Language Models in the PyLaia Open-Source Library}},
    booktitle = {Document Analysis and Recognition - ICDAR 2024},
    year = {2024},
    publisher = {Springer Nature Switzerland},
    address = {Cham},
    pages = {387--404},
    isbn = {978-3-031-70549-6}
}
```