|
--- |
|
language: fr |
|
datasets: |
|
- nlpso/m1_fine_tuning_ref_cmbert_io |
|
tag: token-classification |
|
widget: |
|
- text: 'Duflot, loueur de carrosses, r. de Paradis-
505
Poissonnière, 22.' |
|
example_title: 'Noisy entry #1' |
|
- text: 'Duſour el Besnard, march, de bois à bruler,
quai de la Tournelle, 17. etr. des Fossés-
SBernard. 11.
Dí' |
|
example_title: 'Noisy entry #2' |
|
- text: 'Dufour (Charles), épicier, r. St-Denis
☞
332' |
|
example_title: 'Ground-truth entry #1' |
|
--- |
|
|
|
# m1_ind_layers_ref_cmbert_io_level_2 |
|
|
|
## Introduction |
|
|
|
This model was fine-tuned from [Jean-Baptiste/camembert-ner](https://huggingface.co/Jean-Baptiste/camembert-ner) for a **nested NER task** on a dataset of Paris trade directories annotated with nested named entities. |
|
|
|
## Dataset |
|
|
|
Abbreviation|Entity group (level)|Description |
|
-|-|- |
|
O |1 & 2|Outside of a named entity |
|
PER |1|Person or company name |
|
ACT |1 & 2|Person or company professional activity |
|
TITREH |2|Military or civil distinction |
|
DESC |1|Entry full description |
|
TITREP |2|Professional reward |
|
SPAT |1|Address |
|
LOC |2|Street name |
|
CARDINAL |2|Street number |
|
FT |2|Geographical feature |
|
|
|
## Experiment parameters |
|
|
|
* Pretrained model: [Jean-Baptiste/camembert-ner](https://huggingface.co/Jean-Baptiste/camembert-ner) |
|
* Dataset: ground-truth |
|
* Tagging format: IO |
|
* Recognised entities: level 2 |
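
In the IO tagging format listed above, each token is labelled either `O` or `I-<TAG>`, with no `B-` prefix marking the start of an entity. The minimal sketch below turns token-level spans into IO tags. The helper `to_io_tags`, the tokenisation, and the spans are hypothetical and chosen only for illustration; the exact label strings used by this model are an assumption.

```python
def to_io_tags(tokens, spans):
    """Convert entity spans into one IO tag per token.

    tokens: list of token strings
    spans:  list of (start, end, label) token-index triples, end exclusive
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        for i in range(start, end):
            # IO format: every token inside an entity gets the same I- tag.
            tags[i] = f"I-{label}"
    return tags


# Hypothetical tokenisation and level-2 spans for a directory entry.
tokens = ["Dufour", "(", "Charles", ")", ",", "épicier", ",", "r.", "St-Denis", "332"]
spans = [(5, 6, "ACT"), (7, 9, "LOC"), (9, 10, "CARDINAL")]

io_tags = to_io_tags(tokens, spans)
for token, tag in zip(tokens, io_tags):
    print(f"{token}\t{tag}")
```

Because IO has no begin marker, two adjacent entities of the same type cannot be told apart; that is the trade-off for a smaller label set.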
|
|
|
## Load model from Hugging Face |
|
|
|
**Warning 1**: this model only recognises the level-2 entities of the dataset. To recognise nested entities, use it together with [m1_ind_layers_ref_cmbert_io_level_1](https://huggingface.co/nlpso/m1_ind_layers_ref_cmbert_io_level_1), which recognises the level-1 entities. |
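
One possible way to combine the outputs of the two models is to attach each level-2 entity to the level-1 entity whose character span contains it. The sketch below assumes predictions shaped like the dictionaries returned by a `transformers` token-classification pipeline with an aggregation strategy (keys `entity_group`, `word`, `start`, `end`). The helpers `make_entity` and `nest_entities` and the sample predictions are hypothetical and are not part of this repository.

```python
def make_entity(text, word, label):
    """Build a pipeline-like entity dict for the first occurrence of `word` (illustration only)."""
    start = text.index(word)
    return {"entity_group": label, "word": word, "start": start, "end": start + len(word)}


def nest_entities(level_1, level_2):
    """Attach each level-2 entity to the level-1 entity that contains its span."""
    nested = []
    for parent in level_1:
        children = [
            child
            for child in level_2
            if parent["start"] <= child["start"] and child["end"] <= parent["end"]
        ]
        nested.append({**parent, "children": children})
    return nested


text = "Dufour (Charles), épicier, r. St-Denis 332"

# Hand-written stand-ins for the predictions of the level-1 and level-2 models.
level_1 = [
    make_entity(text, "Dufour (Charles)", "PER"),
    make_entity(text, "r. St-Denis 332", "SPAT"),
]
level_2 = [
    make_entity(text, "r. St-Denis", "LOC"),
    make_entity(text, "332", "CARDINAL"),
]

nested = nest_entities(level_1, level_2)
for entity in nested:
    child_labels = [child["entity_group"] for child in entity["children"]]
    print(entity["entity_group"], entity["word"], child_labels)
```

Matching on span containment keeps the two models independent: each can be run separately and their outputs merged afterwards.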
|
|
|
```python |
|
from transformers import AutoTokenizer, AutoModelForTokenClassification |
|
|
|
tokenizer = AutoTokenizer.from_pretrained("nlpso/m1_ind_layers_ref_cmbert_io_level_2") |
|
model = AutoModelForTokenClassification.from_pretrained("nlpso/m1_ind_layers_ref_cmbert_io_level_2") |
|
|
|
|