---
language: fr
tag: token-classification
widget:
- text: 'Duflot, loueur de carrosses, r. de Paradis-
505
Poissonnière, 22.'
example_title: 'Noisy entry #1'
- text: 'Duſour el Besnard, march, de bois à bruler,
quai de la Tournelle, 17. etr. des Fossés-
SBernard. 11.
Dí'
example_title: 'Noisy entry #2'
- text: 'Dufour (Charles), épicier, r. St-Denis
☞
332'
example_title: 'Ground-truth entry #1'
---
# m0_flat_ner_ocr_cmbert_io
## Introduction
This model is a fine-tuned version of [HueyNemud/das22-10-camembert_pretrained](https://huggingface.co/HueyNemud/das22-10-camembert_pretrained) for the **nested NER task**, trained on a nested-NER dataset of Paris trade directories.
## Dataset
Abbreviation|Description
-|-
O |Outside of a named entity
PER |Person or company name
ACT |Person or company professional activity
TITRE |Distinction
LOC |Street name
CARDINAL |Street number
FT |Geographical feature
## Experiment parameters
* Pretrained-model : [HueyNemud/das22-10-camembert_pretrained](https://huggingface.co/HueyNemud/das22-10-camembert_pretrained)
* Dataset : noisy (Pero OCR)
* Tagging format : IO
* Recognised entities : All (flat entities)
## Load the model from the Hugging Face Hub
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("nlpso/m0_flat_ner_ocr_cmbert_io")
model = AutoModelForTokenClassification.from_pretrained("nlpso/m0_flat_ner_ocr_cmbert_io")
```
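For inference, the model and tokenizer can also be wrapped in a `token-classification` pipeline. The snippet below is a usage sketch (the sample entry is taken from the widget examples above; `aggregation_strategy="simple"` is an assumption about how you want subword predictions grouped):

```python
from transformers import pipeline

# Sketch: run the fine-tuned model on one directory entry.
# Downloads the model weights on first use.
ner = pipeline(
    "token-classification",
    model="nlpso/m0_flat_ner_ocr_cmbert_io",
    aggregation_strategy="simple",  # merge subword tokens into entity spans
)

for entity in ner("Dufour (Charles), épicier, r. St-Denis 332"):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```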