---
license: cc-by-nc-sa-4.0
tags:
- generated_from_trainer
datasets:
- nielsr/XFUN
inference: false
base_model: microsoft/layoutxlm-base
model-index:
- name: layoutxlm-finetuned-xfund-fr
  results: []
---

# layoutxlm-finetuned-xfund-fr

This model is a fine-tuned version of [microsoft/layoutxlm-base](https://huggingface.co/microsoft/layoutxlm-base) on the [XFUND](https://github.com/doc-analysis/XFUND) dataset (French split).

## Model usage

This model requires Tesseract with the French language pack for inference, since the processor runs OCR on the input image. In a notebook on a Debian-based system, you can install the language pack with `!sudo apt-get install tesseract-ocr-fra`.

Here's how to use this model:

```
from transformers import AutoProcessor, AutoModelForTokenClassification
import torch
from PIL import Image

processor = AutoProcessor.from_pretrained("nielsr/layoutxlm-finetuned-xfund-fr")
model = AutoModelForTokenClassification.from_pretrained("nielsr/layoutxlm-finetuned-xfund-fr")

# assuming you have a French document, turned into an image
image = Image.open("...").convert("RGB")

# prepare for the model
encoding = processor(image, padding="max_length", max_length=512, truncation=True, return_tensors="pt")

# run inference without tracking gradients
with torch.no_grad():
    outputs = model(**encoding)
    logits = outputs.logits

# pick the highest-scoring label id for each token
predictions = logits.argmax(-1)
```
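
The `predictions` tensor holds one label id per token position, padding included. As a minimal sketch of turning those ids into label names, the hypothetical helper `decode_predictions` below works on plain Python lists. The `example_id2label` mapping is illustrative only; the real mapping should be read from `model.config.id2label`.

```python
def decode_predictions(pred_ids, attention_mask, id2label):
    """Map predicted label ids to label names, skipping padded positions."""
    return [
        id2label[label_id]
        for label_id, keep in zip(pred_ids, attention_mask)
        if keep == 1
    ]

# Illustrative mapping only (an assumption); use model.config.id2label in practice.
example_id2label = {0: "O", 1: "B-QUESTION", 2: "B-ANSWER", 3: "B-HEADER"}

labels = decode_predictions(
    pred_ids=[3, 1, 2, 0, 0],
    attention_mask=[1, 1, 1, 1, 0],  # last position is padding
    id2label=example_id2label,
)
print(labels)
```

With the usage snippet, the inputs would likely be `predictions[0].tolist()` and `encoding["attention_mask"][0].tolist()`. Special tokens are not masked out by the attention mask, so their labels would still appear in the result.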

## Intended uses & limitations

This model can be used for NER on French scanned documents. It can recognize 4 categories: "question", "answer", "header" and "other".

## Training and evaluation data

This checkpoint was trained on the French portion of the multilingual [XFUND](https://github.com/doc-analysis/XFUND) dataset.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- training_steps: 1000
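
To make the schedule concrete: a linear scheduler with a warmup ratio of 0.1 over 1000 steps implies roughly 100 warmup steps, during which the learning rate climbs from 0 to 1e-05, followed by a linear decay back to 0. The sketch below is a plain-Python illustration of that shape under those assumptions. The function `linear_schedule_lr` is hypothetical and is not the trainer's own code.

```python
import math

def linear_schedule_lr(step, base_lr=1e-5, total_steps=1000, warmup_ratio=0.1):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # warmup: scale linearly from 0 up to base_lr
        return base_lr * step / max(1, warmup_steps)
    # decay: scale linearly from base_lr down to 0 at total_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

for step in (0, 50, 100, 550, 1000):
    print(step, linear_schedule_lr(step))
```

The peak rate is reached at step 100, and the rate at step 550 (halfway through the decay phase) is half the peak.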

### Training results



### Framework versions

- Transformers 4.22.1
- PyTorch 1.10.0+cu111
- Datasets 2.4.0
- Tokenizers 0.12.1