bert-base-cased-literary-NER

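An NER model trained on a literary dataset comprising the first chapters of 40 novels. The model supports the following NER classes: PER, ORG and LOC. If you use the model in a Hugging Face pipeline, pass aggregation_strategy="first" so that word pieces are merged back into whole words, each labeled with the tag of its first token.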
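As an illustration of the pipeline setting mentioned above, here is a minimal sketch using the transformers library. The value of MODEL_ID is an assumption: replace it with the full Hub repository id of this model. The helper names build_tagger and group_entities are hypothetical and only serve this example.

```python
from collections import defaultdict

# Assumption: replace with the full Hub repository id (namespace/name) of this model.
MODEL_ID = "bert-base-cased-literary-NER"


def build_tagger(model_id=MODEL_ID):
    """Create a token-classification pipeline that merges word pieces into words."""
    # Imported lazily so the grouping helper below works without transformers installed.
    from transformers import pipeline

    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="first",  # label each word with the tag of its first token
    )


def group_entities(predictions):
    """Group aggregated pipeline output by entity class (PER, ORG, LOC)."""
    grouped = defaultdict(list)
    for pred in predictions:
        # With an aggregation strategy set, each prediction is a dict that
        # includes the keys "entity_group" and "word".
        grouped[pred["entity_group"]].append(pred["word"])
    return dict(grouped)
```

To try it, call build_tagger() once, pass a sentence to the returned pipeline, and feed the result to group_entities. The first call downloads the model weights, so it needs network access.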

Dataset

We corrected the dataset of Dekker et al. (2019) and added LOC and ORG annotations.

Citation

If you use this model in your research, please cite:

@InProceedings{amalvy:hal-03972448,
  title        = {{Data Augmentation for Robust Character Detection in
                  Fantasy Novels}},
  author       = {Amalvy, Arthur and Labatut, Vincent and Dufour,
                  Richard},
  url          = {https://hal.science/hal-03972448},
  booktitle    = {{Workshop on Computational Methods in the Humanities
                  2022}},
  year         = {2022},
  hal_id       = {hal-03972448},
  hal_version  = {v1},
}

The dataset was originally published and annotated by Dekker et al. (2019):

@Article{dekker-2019-evaluation_ner_social_networks_novels,
  author       = {Dekker, N. and Kuhn, T. and van Erp, M.},
  journal      = {PeerJ Computer Science},
  title        = {Evaluating named entity recognition tools for extracting social networks from novels},
  year         = {2019},
  pages        = {e189},
  volume       = {5},
  doi          = {10.7717/peerj-cs.189},
}