manu committed
Commit 072a302
1 Parent(s): 3ce8ae7

Create README.md

Files changed (1): README.md ADDED, +45 -0
 
---
language:
- fr
tags:
- token-classification
- fill-mask
license: mit
datasets:
- iit-cdip
---

This model combines the camembert-base language model with the pretrained LiLT checkpoint from the paper "LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding", and adds a visual backbone initialized from the pretrained "microsoft/dit-base" checkpoint.

Original repository: https://github.com/jpWang/LiLT

To use it, copy the modeling and configuration files from the original repository and load the pretrained model with the corresponding classes (LiLTRobertaLikeVisionConfig, LiLTRobertaLikeVisionForRelationExtraction, LiLTRobertaLikeVisionForTokenClassification, LiLTRobertaLikeVisionModel).
These classes can also be registered with the AutoConfig/AutoModel factories, as follows:

```python
from transformers import AutoConfig, AutoModel, AutoModelForTokenClassification, AutoTokenizer

# path_to_custom_classes refers to the modeling/configuration files copied from the original repository
from path_to_custom_classes import (
    LiLTRobertaLikeVisionConfig,
    LiLTRobertaLikeVisionForRelationExtraction,
    LiLTRobertaLikeVisionForTokenClassification,
    LiLTRobertaLikeVisionModel,
)


def patch_transformers():
    # Register the custom configuration and model classes with the Auto* factories
    AutoConfig.register("liltrobertalike", LiLTRobertaLikeVisionConfig)
    AutoModel.register(LiLTRobertaLikeVisionConfig, LiLTRobertaLikeVisionModel)
    AutoModelForTokenClassification.register(LiLTRobertaLikeVisionConfig, LiLTRobertaLikeVisionForTokenClassification)
    # etc...
```

To load the model, it is then possible to use:
```python
# patch_transformers() must have been executed beforehand

tokenizer = AutoTokenizer.from_pretrained("camembert-base")

# Load either the plain encoder or the token-classification variant, depending on the use case
model = AutoModel.from_pretrained("manu/lilt-camembert-dit-base-hf")
model = AutoModelForTokenClassification.from_pretrained("manu/lilt-camembert-dit-base-hf")  # to be fine-tuned on a token classification task
```
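
For fine-tuning on token classification, the label set can be attached to the configuration before loading the weights. The snippet below is only a sketch under the same assumptions as above (patch_transformers() already executed); `label_list` is a hypothetical label set, not something shipped with this checkpoint:

```python
from transformers import AutoConfig, AutoModelForTokenClassification

# patch_transformers() must have been executed beforehand (see above)

# Hypothetical label set, purely for illustration; replace with the labels of your dataset
label_list = ["O", "B-HEADER", "I-HEADER", "B-QUESTION", "I-QUESTION", "B-ANSWER", "I-ANSWER"]

# Attach the label mapping to the configuration so the classification head has the right size
config = AutoConfig.from_pretrained(
    "manu/lilt-camembert-dit-base-hf",
    num_labels=len(label_list),
    id2label={i: label for i, label in enumerate(label_list)},
    label2id={label: i for i, label in enumerate(label_list)},
)
model = AutoModelForTokenClassification.from_pretrained(
    "manu/lilt-camembert-dit-base-hf",
    config=config,
)
```

Storing `id2label`/`label2id` in the configuration keeps the label names alongside the fine-tuned model when it is later saved with `save_pretrained`.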