---
library_name: transformers
datasets:
- fhswf/german_handwriting
language:
- de
---

# Model Card for TGrote11/testModel

<!-- Provide a quick summary of what the model is/does. -->



## Model Details


<!-- Provide a longer summary of what this model is. -->

TrOCR model fine-tuned on the [german_handwriting](https://huggingface.co/datasets/fhswf/german_handwriting) dataset. TrOCR was introduced in the paper [TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models](https://arxiv.org/abs/2109.10282) by Li et al. and first released in [this repository](https://github.com/microsoft/unilm/tree/master/trocr).

- **Developed by:** [More Information Needed]
- **Model type:** Transformer OCR
- **Language(s) (NLP):** German
- **License:** [More Information Needed]
- **Finetuned from model:** [trocr-large-handwritten](https://huggingface.co/microsoft/trocr-large-handwritten)


## Uses

Here is how to use this model in PyTorch:

```python
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
from PIL import Image
import requests
# load a sample text-line image from the IAM handwriting database
url = 'https://fki.tic.heia-fr.ch/static/img/a01-122-02-00.jpg'
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# load the processor (image preprocessing + tokenizer) and the fine-tuned model
processor = TrOCRProcessor.from_pretrained('TGrote11/testModel')
model = VisionEncoderDecoderModel.from_pretrained('TGrote11/testModel')

# preprocess the image, generate token ids, and decode them to text
pixel_values = processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(generated_text)
```
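Continuing from the snippet above, the same pipeline can be run on your own text-line image of German handwriting; the file name below is a placeholder, not part of this repository:

```python
# Transcribe a local single text-line image of German handwriting
# ("my_german_line.png" is a placeholder path)
image = Image.open("my_german_line.png").convert("RGB")
pixel_values = processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```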

## Bias, Risks, and Limitations

The model is intended for optical character recognition (OCR) on single text-line images of German handwriting. Full pages need to be segmented into lines first, and performance on printed text or other languages is not evaluated in this card.



## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

This model was fine-tuned on the [german_handwriting](https://huggingface.co/datasets/fhswf/german_handwriting) dataset.
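
For reference, the dataset can be loaded directly from the Hub; the available splits and column names are not documented in this card, so inspect them before use:

```python
from datasets import load_dataset

# Download the fine-tuning dataset from the Hugging Face Hub
ds = load_dataset("fhswf/german_handwriting")
print(ds)  # inspect available splits and features before preprocessing
```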

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

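The exact training recipe is not documented in this card. As an illustration only, a fine-tuning run of the base checkpoint on this dataset could be set up along the following lines; the column names, split name, sequence length, and all hyperparameters are assumptions, not the procedure actually used:

```python
from datasets import load_dataset
from transformers import (TrOCRProcessor, VisionEncoderDecoderModel,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments,
                          default_data_collator)

processor = TrOCRProcessor.from_pretrained("microsoft/trocr-large-handwritten")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-large-handwritten")

# VisionEncoderDecoderModel needs these token ids set explicitly before training
model.config.decoder_start_token_id = processor.tokenizer.cls_token_id
model.config.pad_token_id = processor.tokenizer.pad_token_id
model.config.eos_token_id = processor.tokenizer.sep_token_id
model.config.vocab_size = model.config.decoder.vocab_size

ds = load_dataset("fhswf/german_handwriting")

def transform(batch):
    # "image" and "text" column names are assumptions; check the dataset card
    images = [img.convert("RGB") for img in batch["image"]]
    pixel_values = processor(images=images, return_tensors="pt").pixel_values
    labels = processor.tokenizer(batch["text"], padding="max_length",
                                 max_length=64, truncation=True).input_ids
    # mask padding tokens so they are ignored by the loss
    labels = [[t if t != processor.tokenizer.pad_token_id else -100 for t in seq]
              for seq in labels]
    return {"pixel_values": pixel_values, "labels": labels}

ds.set_transform(transform)

training_args = Seq2SeqTrainingArguments(
    output_dir="trocr-german-handwriting",  # placeholder output directory
    per_device_train_batch_size=4,          # assumed
    num_train_epochs=3,                     # assumed
    learning_rate=5e-5,                     # assumed
    predict_with_generate=True,
    logging_steps=100,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=ds["train"],              # split name is an assumption
    data_collator=default_data_collator,
)
trainer.train()
```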

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->
- Levenshtein: 1.85
- WER (Word Error Rate): 17.5%
- CER (Character Error Rate): 4.1%
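
A minimal sketch of how such metrics could be recomputed with the Hugging Face `evaluate` library; the predictions and references below are stand-ins, and the actual evaluation split and protocol are not documented in this card:

```python
import evaluate

# Stand-in model outputs and ground-truth transcriptions
predictions = ["Ein Beispiel für deutsche Handschrift"]
references = ["Ein Beispiel für deutsche Handschrift."]

cer = evaluate.load("cer")  # character error rate
wer = evaluate.load("wer")  # word error rate
print("CER:", cer.compute(predictions=predictions, references=references))
print("WER:", wer.compute(predictions=predictions, references=references))
```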




## Citation

**BibTeX:**

```bibtex
@misc{li2021trocr,
      title={TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models}, 
      author={Minghao Li and Tengchao Lv and Lei Cui and Yijuan Lu and Dinei Florencio and Cha Zhang and Zhoujun Li and Furu Wei},
      year={2021},
      eprint={2109.10282},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```