---
language: fr
datasets:
- nlpso/m1_fine_tuning_ocr_cmbert_iob2
tag: token-classification
widget:
- text: 'Duflot, loueur de carrosses, r. de Paradis-
505
Poissonnière, 22.'
example_title: 'Noisy entry #1'
- text: 'Duſour el Besnard, march, de bois à bruler,
quai de la Tournelle, 17. etr. des Fossés-
SBernard. 11.
Dí'
example_title: 'Noisy entry #2'
- text: 'Dufour (Charles), épicier, r. St-Denis
☞
332'
example_title: 'Ground-truth entry #1'
---
# m1_ind_layers_ocr_cmbert_iob2_level_2
## Introduction
This model was fine-tuned from [Jean-Baptiste/camembert-ner](https://huggingface.co/Jean-Baptiste/camembert-ner) for a **nested NER task** on a nested NER dataset of Paris trade directories.
## Dataset
Abbreviation|Entity group (level)|Description
-|-|-
O |1 & 2|Outside of a named entity
PER |1|Person or company name
ACT |1 & 2|Person or company professional activity
TITREH |2|Military or civil distinction
DESC |1|Entry full description
TITREP |2|Professional reward
SPAT |1|Address
LOC |2|Street name
CARDINAL |2|Street number
FT |2|Geographical feature
## Experiment parameters
* Pretrained model: [Jean-Baptiste/camembert-ner](https://huggingface.co/Jean-Baptiste/camembert-ner)
* Dataset: noisy (Pero OCR)
* Tagging format: IOB2
* Recognised entities: level 2
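
As an illustration of the IOB2 tagging format listed above, the following minimal sketch groups tagged tokens into entity spans. The helper `iob2_to_spans` and the example tags are hypothetical: the tags are written by hand for illustration and are not actual output of this model.

```python
from typing import List, Tuple


def iob2_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group IOB2-tagged tokens into (label, text) spans.

    In IOB2, every entity starts with a "B-<LABEL>" tag, continues with
    "I-<LABEL>" tags, and "O" marks tokens outside any entity.
    """
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    def close_span() -> None:
        # Store the open span, if any, and reset the state.
        nonlocal current_label, current_tokens
        if current_label is not None:
            spans.append((current_label, " ".join(current_tokens)))
        current_label = None
        current_tokens = []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            close_span()
            current_label = tag[2:]
            current_tokens = [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            current_tokens.append(token)
        else:
            # "O", or an "I-" tag that does not continue the open span
            # (such a stray token is dropped in this sketch).
            close_span()
    close_span()
    return spans


# Hand-written level-2 tags for illustration only (not model predictions).
tokens = ["Dufour", "(Charles),", "épicier,", "r.", "St-Denis", "332"]
tags = ["O", "O", "B-ACT", "B-LOC", "I-LOC", "B-CARDINAL"]
spans = iob2_to_spans(tokens, tags)
print(spans)
```

Joining tokens with a space is a simplification; a real post-processing step would rather use character offsets from the tokenizer.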
## Load the model from Hugging Face
**Warning 1**: this model only recognises the level-2 entities of the dataset. It has to be used together with [m1_ind_layers_ocr_cmbert_iob2_level_1](https://huggingface.co/nlpso/m1_ind_layers_ocr_cmbert_iob2_level_1) to also recognise the level-1 nested entities.
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("nlpso/m1_ind_layers_ocr_cmbert_iob2_level_2")
model = AutoModelForTokenClassification.from_pretrained("nlpso/m1_ind_layers_ocr_cmbert_iob2_level_2")
```