metadata
language: fr
datasets:
  - nlpso/m2m3_fine_tuning_ocr_ptrn_cmbert_io
tag: token-classification
widget:
  - text: "Duflot, loueur de carrosses, r. de Paradis-\P    505\P    Poissonnière, 22."
    example_title: 'Noisy entry #1'
  - text: "Duſour el Besnard, march, de bois à bruler,\P    quai de la Tournelle, 17. etr. des Fossés-\P    SBernard. 11.\P    Dí"
    example_title: 'Noisy entry #2'
  - text: "Dufour (Charles), épicier, r. St-Denis\P    ☞\P    332"
    example_title: 'Ground-truth entry #1'

m3_hierarchical_ner_ocr_ptrn_cmbert_io

Introduction

This model is a fine-tuned version of HueyNemud/das22-10-camembert_pretrained for nested named entity recognition (NER), trained on a dataset of Paris trade directory entries annotated with nested entities.

Dataset

| Abbreviation | Entity group (level) | Description |
|---|---|---|
| O | 1 & 2 | Outside of a named entity |
| PER | 1 | Person or company name |
| ACT | 1 & 2 | Person or company professional activity |
| TITREH | 2 | Military or civil distinction |
| DESC | 1 | Entry full description |
| TITREP | 2 | Professional reward |
| SPAT | 1 | Address |
| LOC | 2 | Street name |
| CARDINAL | 2 | Street number |
| FT | 2 | Geographical feature |
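
The two-level scheme in the table above can be written down as a small lookup structure, which may help when filtering predictions by nesting level. This is only an illustrative sketch: the names `ENTITY_LEVELS` and `labels_at_level` are hypothetical and do not come from the model or its training code.

```python
# Entity abbreviations mapped to the nesting level(s) at which they occur,
# transcribed from the table above. The names used here are illustrative only.
ENTITY_LEVELS = {
    "O": (1, 2),
    "PER": (1,),
    "ACT": (1, 2),
    "TITREH": (2,),
    "DESC": (1,),
    "TITREP": (2,),
    "SPAT": (1,),
    "LOC": (2,),
    "CARDINAL": (2,),
    "FT": (2,),
}


def labels_at_level(level):
    """Return the abbreviations that can appear at the given level (1 or 2)."""
    return [name for name, levels in ENTITY_LEVELS.items() if level in levels]
```

For example, `labels_at_level(2)` lists the entity types that can be nested inside a level-1 entity, such as a street name (LOC) inside an address (SPAT).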

Experiment parameter

Load the model from the Hugging Face Hub

from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("nlpso/m3_hierarchical_ner_ocr_ptrn_cmbert_io")
model = AutoModelForTokenClassification.from_pretrained("nlpso/m3_hierarchical_ner_ocr_ptrn_cmbert_io")
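
Once the model is loaded, a directory entry can be tagged with the `transformers` token-classification pipeline. The sketch below is a minimal example under several assumptions: the helper names `merge_io_tokens` and `tag_entry` are hypothetical, the `_io` suffix in the model name is taken to mean IO tagging (no B-/I- prefixes), and the exact label strings returned by the model are not documented here, so the labels in the comments are placeholders. Because IO tagging has no begin marker, two adjacent entities of the same type are merged into one span.

```python
MODEL_ID = "nlpso/m3_hierarchical_ner_ocr_ptrn_cmbert_io"


def merge_io_tokens(tokens):
    """Merge consecutive tokens that share a label into (label, start, end) spans.

    `tokens` is a list of dicts shaped like the pipeline's per-token output,
    with the keys "entity", "index", "start" and "end".
    """
    spans = []
    for tok in tokens:
        last = spans[-1] if spans else None
        # Extend the current span only if the label matches and the token
        # directly follows the previous one (the pipeline drops "O" tokens).
        if last and last["label"] == tok["entity"] and tok["index"] == last["last_index"] + 1:
            last["end"] = tok["end"]
            last["last_index"] = tok["index"]
        else:
            spans.append({
                "label": tok["entity"],
                "start": tok["start"],
                "end": tok["end"],
                "last_index": tok["index"],
            })
    return [(s["label"], s["start"], s["end"]) for s in spans]


def tag_entry(text, model_id=MODEL_ID):
    """Tag one directory entry and return merged character spans."""
    # Imported lazily so that merge_io_tokens can be used without transformers.
    from transformers import pipeline

    tagger = pipeline("token-classification", model=model_id)
    return merge_io_tokens(tagger(text))


# Example call (downloads the model on first use):
# spans = tag_entry("Dufour (Charles), épicier, r. St-Denis")
```

The returned offsets index into the input string, so `text[start:end]` recovers the surface form of each span.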