license: cc-by-nc-4.0
language:
- fr
pipeline_tag: token-classification
widget:
- text: >-
* ALBI, (Géog.) ville de France, capitale de l'Albigeois, dans le haut
Languedoc : elle est sur le Tarn. Long. 19. 49. lat. 43. 55. 44.
- text: >-
HILPERHAUSEN, (Géog.) ville d'Allemagne en Franconie, sur la Werra, au
comté de Henneberg, entre Cobourg & Smalcalde ; elle appartient à une
branche de la maison de Saxe-Gotha. Long. 28. 15. lat. 50. 35. (D. J.)
bert-base-french-cased-edda-ner-levels
This model detects and classifies named entities, with labels that follow the IOB2 tagging scheme. It has been trained on the French Encyclopédie ou dictionnaire raisonné des sciences des arts et des métiers par une société de gens de lettres (1751-1772) edited by Diderot and d'Alembert (provided by the ARTFL Encyclopédie Project). Dataset: https://huggingface.co/datasets/GEODE/GeoEDdA
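As a rough illustration of the IOB2 scheme, the sketch below merges token-level tags into labelled spans: a `B-` prefix opens an entity, an `I-` prefix continues it, and `O` marks tokens outside any entity. The function name `iob2_to_spans` and the example tags are hypothetical and hand-written for this card; they are not actual model output.

```python
from typing import List, Tuple


def iob2_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge IOB2-tagged tokens into (label, text) spans."""
    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Close the currently open span, if any.
        nonlocal current_label, current_tokens
        if current_label is not None:
            spans.append((current_label, " ".join(current_tokens)))
        current_label, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new span.
            flush()
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag extends the open span with the same label.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open span:
            # close the open span and skip this token.
            flush()
    flush()
    return spans


# Hand-written tags for illustration only (assumed, not produced by the model).
tokens = ["ville", "de", "France", "sur", "le", "Tarn"]
tags = ["B-NC-Spatial", "O", "B-NP-Spatial", "B-Relation",
        "B-NP-Spatial", "I-NP-Spatial"]

print(iob2_to_spans(tokens, tags))
# → [('NC-Spatial', 'ville'), ('NP-Spatial', 'France'),
#    ('Relation', 'sur'), ('NP-Spatial', 'le Tarn')]
```

A single flat tag sequence like this one cannot represent nested entities, so the example deliberately leaves out the ENE classes.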
Class labels
The entity classes detected by this model are:
- NC-Spatial: a common noun that identifies a spatial entity (nominal spatial entity), including natural features, e.g. `ville`, `la rivière`, `royaume`.
- NP-Spatial: a proper noun identifying the name of a place (spatial named entity), e.g. `France`, `Paris`, `la Chine`.
- ENE-Spatial: a nested spatial entity, e.g. `ville de France`, `royaume de Naples`, `la mer Baltique`.
- Relation: a spatial relation, e.g. `dans`, `sur`, `à 10 lieues de`.
- Latlong: geographic coordinates, e.g. `Long. 19. 49. lat. 43. 55. 44.`
- NC-Person: a common noun that identifies a person (nominal person entity), e.g. `roi`, `l'empereur`, `les auteurs`.
- NP-Person: a proper noun identifying the name of a person (person named entity), e.g. `Louis XIV`, `Pline`, `les Romains`.
- ENE-Person: a nested person entity, e.g. `le czar Pierre`, `roi de Macédoine`.
- NP-Misc: a proper noun identifying an entity not classified as spatial or person, e.g. `l'Eglise`, `1702`, `Pélasgique`.
- ENE-Misc: a nested named entity not classified as spatial or person, e.g. `l'ordre de S. Jacques`, `la déclaration du 21 Mars 1671`.
- Head: the entry name.
- Domain-Mark: words indicating the knowledge domain (usually after the head and in parentheses), e.g. `Géographie`, `Geog.`, `en Anatomie`.
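The labels above can be collected per class once the model has been run. The following is a minimal sketch using the `transformers` token-classification pipeline. The repository id in `MODEL_ID` is an assumption (this card does not state it), and `tag_entry` and `group_by_label` are hypothetical helper names. How well `aggregation_strategy="simple"` suits this model's nested classes has not been checked here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Assumption: the Hub repository id is not given in this card; adjust as needed.
MODEL_ID = "GEODE/bert-base-french-cased-edda-ner-levels"


def tag_entry(text: str, model_id: str = MODEL_ID) -> List[dict]:
    """Run the token-classification pipeline on one encyclopedia entry."""
    # Imported lazily so group_by_label can be used without transformers.
    from transformers import pipeline

    ner = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge word pieces into entity groups
    )
    return ner(text)


def group_by_label(entities: Iterable[dict]) -> Dict[str, List[str]]:
    """Group aggregated pipeline output by entity class.

    Each item is expected to carry the "entity_group" and "word" keys that
    the pipeline returns when an aggregation strategy is set.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entity in entities:
        grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)
```

Calling `group_by_label(tag_entry(text))` on one of the widget texts would return a dictionary mapping each detected class to the list of matched strings; the first call downloads the model weights.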
Bias, Risks, and Limitations
This model was trained entirely on French encyclopedic entries and will likely not perform well on text in other languages or from other corpora.
Acknowledgement
The authors are grateful to the ASLAN project (ANR-10-LABX-0081) of the Université de Lyon for its financial support within the French program "Investments for the Future" operated by the National Research Agency (ANR). Data courtesy of the ARTFL Encyclopédie Project, University of Chicago.