3chez committed on
Commit
5524e07
1 Parent(s): d814503

End of training

Files changed (1)
  1. README.md +14 -35
README.md CHANGED
@@ -1,54 +1,33 @@
  ---
  license: cc-by-nc-sa-4.0
  tags:
  - generated_from_trainer
- base_model: microsoft/layoutxlm-base
  datasets:
- - nielsr/XFUN
- inference: false
  model-index:
  - name: layoutxlm-finetuned-xfund-fr
    results: []
  ---

- # layoutxlm-finetuned-xfund-fr
-
- This model is a fine-tuned version of [microsoft/layoutxlm-base](https://huggingface.co/microsoft/layoutxlm-base) on the [XFUND](https://github.com/doc-analysis/XFUND) dataset (French split).
-
- ## Model usage

- Note that this model requires Tesseract, French package, in order to perform inference. You can install it using `!sudo apt-get install tesseract-ocr-fra`.
-
- Here's how to use this model:
-
- ```
- from transformers import AutoProcessor, AutoModelForTokenClassification
- import torch
- from PIL import Image
-
- processor = AutoProcessor.from_pretrained("nielsr/layoutxlm-finetuned-xfund-fr")
- model = AutoModelForTokenClassification.from_pretrained(nielsr/layoutxlm-finetuned-xfund-fr")

- # assuming you have a French document, turned into an image
- image = Image.open("...").convert("RGB")

- # prepare for the model
- encoding = processor(image, padding="max_length", max_length=512, truncation=True, return_tensors="pt")

- with torch.no_grad():
-     outputs = model(**encoding)
-     logits = outputs.logits
-
- predictions = logits.argmax(-1)
- ```

  ## Intended uses & limitations

- This model can be used for NER on French scanned documents. It can recognize 4 categories: "question", "answer", "header" and "other".

  ## Training and evaluation data

- This checkpoint used the French portion of the multilingual [XFUND](https://github.com/doc-analysis/XFUND) dataset.

  ## Training procedure

@@ -70,7 +49,7 @@ The following hyperparameters were used during training:

  ### Framework versions

- - Transformers 4.22.1
- - Pytorch 1.10.0+cu111
- - Datasets 2.4.0
- - Tokenizers 0.12.1

  ---
  license: cc-by-nc-sa-4.0
+ base_model: microsoft/layoutxlm-base
  tags:
  - generated_from_trainer
  datasets:
+ - xfun
  model-index:
  - name: layoutxlm-finetuned-xfund-fr
    results: []
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->

+ # layoutxlm-finetuned-xfund-fr

+ This model is a fine-tuned version of [microsoft/layoutxlm-base](https://huggingface.co/microsoft/layoutxlm-base) on the xfun dataset.

+ ## Model description

+ More information needed

  ## Intended uses & limitations

+ More information needed

  ## Training and evaluation data

+ More information needed

  ## Training procedure


  ### Framework versions

+ - Transformers 4.40.2
+ - Pytorch 2.2.1+cu121
+ - Datasets 2.19.1
+ - Tokenizers 0.19.1