nlpso committed on
Commit
f08fd41
1 Parent(s): 6625c19

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +51 -0
README.md ADDED
@@ -0,0 +1,51 @@
1
+ ---
2
+ language: fr
3
+ tag: token-classification
4
+ widget:
5
+ - text: 'Duflot, loueur de carrosses, r. de Paradis-
 505
 Poissonnière, 22.'
6
+ example_title: 'Noisy entry #1'
7
+ - text: 'Duſour el Besnard, march, de bois à bruler,
 quai de la Tournelle, 17. etr. des Fossés-
 SBernard. 11.
 Dí'
8
+ example_title: 'Noisy entry #2'
9
+ - text: 'Dufour (Charles), épicier, r. St-Denis
 ☞
 332'
10
+ example_title: 'Ground-truth entry #1'
11
+ ---
12
+
13
+ # m1_ind_layers_ocr_ptrn_cmbert_io_level_2
14
+
15
+ ## Introduction
16
+
17
+ This model was fine-tuned from [HueyNemud/das22-10-camembert_pretrained](https://huggingface.co/nlpso/HueyNemud/das22-10-camembert_pretrained) for a **nested NER task** on a dataset of Paris trade directories annotated with nested named entities.
18
+
19
+ ## Dataset
20
+
21
+ Abbreviation|Entity group (level)|Description
22
+ -|-|-
23
+ O |1 & 2|Outside of a named entity
24
+ PER |1|Person or company name
25
+ ACT |1 & 2|Person or company professional activity
26
+ TITREH |2|Military or civil distinction
27
+ DESC |1|Entry full description
28
+ TITREP |2|Professional reward
29
+ SPAT |1|Address
30
+ LOC |2|Street name
31
+ CARDINAL |2|Street number
32
+ FT |2|Geographical feature
33
+
34
+ ## Experiment parameters
35
+
36
+ * Pretrained model: [HueyNemud/das22-10-camembert_pretrained](https://huggingface.co/nlpso/HueyNemud/das22-10-camembert_pretrained)
37
+ * Dataset: noisy (Pero OCR)
38
+ * Tagging format: IO
39
+ * Recognised entities: level 2
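The IO tagging format listed above marks each token either as `O` (outside any entity) or as `I-XXX` (inside an entity of type `XXX`), with no separate "begin" tag. A minimal sketch of how such tags could be grouped into entity spans follows. The helper name `io_to_spans`, the `I-` label spelling, and the example tags are assumptions made for illustration; they are not taken from the model's configuration or from actual model output.

```python
from typing import List, Tuple


def io_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group consecutive tokens carrying the same IO tag into (label, text) spans."""
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []
    for token, tag in zip(tokens, tags):
        # "O" means outside any entity; "I-XXX" means inside an entity of type XXX.
        label = None if tag == "O" else tag.split("-", 1)[-1]
        if label != current_label:
            # The label changed: close the span that was open, if any.
            if current_label is not None:
                spans.append((current_label, " ".join(current_tokens)))
            current_label = label
            current_tokens = []
        if label is not None:
            current_tokens.append(token)
    # Close the span still open at the end of the sequence.
    if current_label is not None:
        spans.append((current_label, " ".join(current_tokens)))
    return spans


# Hand-written level-2 tags for illustration only; this is not actual model output.
tokens = ["Dufour", "(Charles),", "épicier,", "r.", "St-Denis", "332"]
tags = ["O", "O", "I-ACT", "I-LOC", "I-LOC", "I-CARDINAL"]
spans = io_to_spans(tokens, tags)
print(spans)
```

Because IO has no "begin" marker, two adjacent entities of the same type cannot be told apart and are merged into a single span by this grouping.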
40
+
41
+ ## Load the model from the Hugging Face Hub
42
+
43
+ **Warning**: this model only recognises the level-2 entities of the dataset. To also recognise the level-1 entities that contain them, use it together with [m1_ind_layers_ocr_ptrn_cmbert_io_level_1](https://huggingface.co/nlpso/m1_ind_layers_ocr_ptrn_cmbert_io_level_1).
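Once both models have been run on the same entry, their outputs still have to be combined to obtain nested entities. One possible way to do this, sketched below, is to attach each level-2 span to the level-1 span whose character range contains it. The function `nest_entities`, the helper `span`, and the example spans are hypothetical. The `entity_group`, `start`, and `end` keys follow the shape of the dictionaries returned by the `transformers` token-classification pipeline when an aggregation strategy is set, but nothing here is the authors' own merging procedure.

```python
from typing import Dict, List


def nest_entities(level_1: List[Dict], level_2: List[Dict]) -> List[Dict]:
    """Attach each level-2 span to the level-1 span whose character range contains it."""
    # Copy the parents so the input list is left untouched.
    nested = [dict(parent, children=[]) for parent in level_1]
    for child in level_2:
        for parent in nested:
            if parent["start"] <= child["start"] and child["end"] <= parent["end"]:
                parent["children"].append(child)
                break  # a child is attached to at most one parent
    # Level-2 spans that fit inside no level-1 span are left out of the result.
    return nested


text = "Dufour (Charles), épicier, r. St-Denis 332"


def span(label: str, surface: str) -> Dict:
    """Build a span dictionary from the first occurrence of `surface` in `text`."""
    start = text.index(surface)
    return {"entity_group": label, "start": start, "end": start + len(surface)}


# Hand-written spans for illustration only; this is not actual model output.
level_1 = [span("PER", "Dufour (Charles)"), span("ACT", "épicier"), span("SPAT", "r. St-Denis 332")]
level_2 = [span("LOC", "r. St-Denis"), span("CARDINAL", "332")]

nested = nest_entities(level_1, level_2)
for parent in nested:
    print(parent["entity_group"], [child["entity_group"] for child in parent["children"]])
```

In this example the address span (`SPAT`) ends up holding the street name (`LOC`) and the street number (`CARDINAL`), which matches the level structure given in the dataset table.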
44
+
45
+ ```python
46
+ from transformers import AutoTokenizer, AutoModelForTokenClassification
47
+ # The "nlpso/" namespace is assumed from the level-1 model link above.
48
+ tokenizer = AutoTokenizer.from_pretrained("nlpso/m1_ind_layers_ocr_ptrn_cmbert_io_level_2")
49
+ model = AutoModelForTokenClassification.from_pretrained("nlpso/m1_ind_layers_ocr_ptrn_cmbert_io_level_2")
50
+ ```
51
+