---
language:
- de
library_name: transformers
datasets:
- fhswf/german_handwriting
license: afl-3.0
pipeline_tag: image-to-text
---

# Model Card for TrOCR_german_handwritten

<!-- Provide a quick summary of what the model is/does. -->



## Model Details


<!-- Provide a longer summary of what this model is. -->

This is a TrOCR model fine-tuned on the [german_handwriting](https://huggingface.co/datasets/fhswf/german_handwriting) dataset. The TrOCR architecture was introduced in the paper [TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models](https://arxiv.org/abs/2109.10282) by Li et al. and first released in [this repository](https://github.com/microsoft/unilm/tree/master/trocr).

- **Developed by:** [More Information Needed]
- **Model type:** Transformer OCR
- **Language(s) (NLP):** German
- **License:** afl-3.0
- **Finetuned from model:** [TrOCR_large_handwritten](https://huggingface.co/microsoft/trocr-large-handwritten)


## Uses

Here is how to use this model in PyTorch:

```python
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
from PIL import Image
import requests

# Load a sample single-line handwriting image from the IAM database
url = 'https://fki.tic.heia-fr.ch/static/img/a01-122-02-00.jpg'
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# Load the processor (image preprocessing and tokenizer) and the fine-tuned model
processor = TrOCRProcessor.from_pretrained('fhswf/TrOCR_german_handwritten')
model = VisionEncoderDecoderModel.from_pretrained('fhswf/TrOCR_german_handwritten')

# Preprocess the image, generate token IDs, and decode them into text
pixel_values = processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(generated_text)
```

## Bias, Risks, and Limitations

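This model is intended for optical character recognition (OCR) on single text-line images of German handwriting. Images containing several lines or whole pages fall outside this scope and would need to be split into individual lines first.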



## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

This model was finetuned on [german_handwriting](https://huggingface.co/datasets/fhswf/german_handwriting).



## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->
Levenshtein: 1.85 <br>
WER (Word Error Rate): 17.5% <br>
CER (Character Error Rate): 4.1% 
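
The card does not state how these metrics were computed. As a rough guide, the sketch below shows one common way to define them: CER and WER as the Levenshtein (edit) distance between prediction and reference, normalized by the reference length in characters or words. The function names `levenshtein`, `cer`, and `wer` are hypothetical helpers written for illustration, not part of this model's evaluation code, and the actual evaluation may have differed (for example in normalization or averaging).

```python
def levenshtein(ref, hyp):
    """Edit distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits divided by number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must not be empty")
    return levenshtein(ref_words, hypothesis.split()) / len(ref_words)


# Illustrative strings, not taken from the dataset
reference = "Guten Morgen"
prediction = "Guten Morgan"
print(levenshtein(reference, prediction))  # 1 (one substituted character)
print(wer(reference, prediction))          # 0.5 (one of two words is wrong)
```

Because a single wrong character makes the whole word count as an error, WER is typically much higher than CER on the same predictions.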




**BibTeX:**

```bibtex
@misc{li2021trocr,
      title={TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models}, 
      author={Minghao Li and Tengchao Lv and Lei Cui and Yijuan Lu and Dinei Florencio and Cha Zhang and Zhoujun Li and Furu Wei},
      year={2021},
      eprint={2109.10282},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```