nlpso committed on
Commit
62d3824
1 Parent(s): f2a84c8

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +45 -0
README.md ADDED
@@ -0,0 +1,45 @@
---
language: fr
tag: token-classification
widget:
- text: 'Duflot, loueur de carrosses, r. de Paradis-
    505
    Poissonnière, 22.'
  example_title: 'Noisy entry #1'
- text: 'Duſour el Besnard, march, de bois à bruler,
    quai de la Tournelle, 17. etr. des Fossés-
    SBernard. 11.
    Dí'
  example_title: 'Noisy entry #2'
- text: 'Dufour (Charles), épicier, r. St-Denis
    ☞
    332'
  example_title: 'Ground-truth entry #1'
---

# m0_flat_ner_ocr_cmbert_io

## Introduction

This model is a fine-tuned version of [HueyNemud/das22-10-camembert_pretrained](https://huggingface.co/nlpso/HueyNemud/das22-10-camembert_pretrained) for the **nested NER task** on a nested NER dataset of Paris trade directories.

## Dataset

Abbreviation|Description
-|-
O |Outside of a named entity
PER |Person or company name
ACT |Person or company professional activity
TITRE |Distinction
LOC |Street name
CARDINAL |Street number
FT |Geographical feature

## Experiment parameters

* Pretrained model: [HueyNemud/das22-10-camembert_pretrained](https://huggingface.co/nlpso/HueyNemud/das22-10-camembert_pretrained)
* Dataset: noisy (Pero OCR)
* Tagging format: IO
* Recognised entities: All (flat entities)

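The IO tagging format marks each token either as inside an entity of a given type or as outside any entity (`O`), with no explicit begin marker. The following is a minimal sketch of how such tags could be grouped into entity spans. The `io_to_spans` helper, the `I-` label prefix, and the hand-tagged example are illustrative assumptions and are not taken from the model's configuration.

```python
from itertools import groupby


def io_to_spans(tokens, tags):
    """Group consecutive tokens that share the same non-O tag into (label, text) spans."""
    spans = []
    for tag, group in groupby(zip(tokens, tags), key=lambda pair: pair[1]):
        if tag == "O":
            continue  # tokens outside any entity are skipped
        # Assumed label naming: drop an "I-" prefix if present
        label = tag[2:] if tag.startswith("I-") else tag
        spans.append((label, " ".join(token for token, _ in group)))
    return spans


# Hand-tagged illustration based on the ground-truth widget entry (not model output)
tokens = ["Dufour", "(Charles)", ",", "épicier", ",", "r.", "St-Denis", "332"]
tags = ["I-PER", "I-PER", "O", "I-ACT", "O", "I-LOC", "I-LOC", "I-CARDINAL"]

spans = io_to_spans(tokens, tags)
print(spans)
```

Because IO has no begin marker, two adjacent entities of the same type are merged into a single span by this grouping, which is a known limitation of the format.
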
## Load model from the Hugging Face Hub

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Repository ID assumes the model is hosted under the "nlpso" namespace
tokenizer = AutoTokenizer.from_pretrained("nlpso/m0_flat_ner_ocr_cmbert_io")
model = AutoModelForTokenClassification.from_pretrained("nlpso/m0_flat_ner_ocr_cmbert_io")
```
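
Once the model is loaded, one possible way to use it is through the `transformers` token-classification pipeline. The sketch below is hedged: the repository ID `nlpso/m0_flat_ner_ocr_cmbert_io` is an assumption inferred from the committing account, `tag_entry` and `to_record` are hypothetical helper names, and the `mock_entities` list is a hand-written stand-in for pipeline output rather than real predictions.

```python
from collections import defaultdict

MODEL_ID = "nlpso/m0_flat_ner_ocr_cmbert_io"  # assumed repository ID


def tag_entry(text, model_id=MODEL_ID):
    """Run the token-classification pipeline on one directory entry (downloads the model)."""
    from transformers import pipeline  # lazy import so to_record() works without transformers

    ner = pipeline("token-classification", model=model_id, aggregation_strategy="simple")
    return ner(text)


def to_record(entities, min_score=0.0):
    """Collect predicted words per entity group, in reading order."""
    record = defaultdict(list)
    for entity in entities:
        if entity["score"] >= min_score:
            record[entity["entity_group"]].append(entity["word"])
    return dict(record)


# Hand-written stand-in with the keys returned by aggregated pipeline output
mock_entities = [
    {"entity_group": "PER", "word": "Dufour (Charles)", "score": 0.99},
    {"entity_group": "ACT", "word": "épicier", "score": 0.98},
    {"entity_group": "LOC", "word": "r. St-Denis", "score": 0.97},
    {"entity_group": "CARDINAL", "word": "332", "score": 0.40},
]

record = to_record(mock_entities, min_score=0.5)
print(record)
```

With network access, `to_record(tag_entry("Dufour (Charles), épicier, r. St-Denis 332"))` would apply the same grouping to actual model predictions.
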