Tagger for literary character mentions (DROC corpus)

This is the character recognizer model used in LLpro. It detects character mentions in literary fiction, covering both (a) proper nouns ("Alice", "Effi") and (b) nominal phrases ("Gärtner", "Mutter", "Graf", "Idiot", "Schöne", ...). The model was trained on the DROC dataset by fine-tuning the domain-adapted lkonle/fiction-gbert-large. (Training code)

F1-Score: 91.85 (on a held-out data split; micro average on B-PER and I-PER labels)
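To make the metric above concrete, here is a minimal sketch of a token-level micro-averaged F1 restricted to the B-PER and I-PER labels. It is an illustration only: the function name `micro_f1`, the toy tag sequences, and the exact counting convention are assumptions, and the reported score may have been computed with different tooling.

```python
from typing import Sequence

# Labels included in the micro average (taken from the score description above).
CHARACTER_LABELS = ("B-PER", "I-PER")


def micro_f1(gold: Sequence[str], pred: Sequence[str],
             labels: Sequence[str] = CHARACTER_LABELS) -> float:
    """Token-level micro F1 over `labels` (hypothetical helper, not from LLpro)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if p in labels and p == g:
            tp += 1          # correct character label
        else:
            if p in labels:
                fp += 1      # predicted a character label that is wrong
            if g in labels:
                fn += 1      # missed (or mislabeled) a gold character label
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example with hand-written tags (not real model output).
gold = ["B-PER", "O", "B-PER", "I-PER", "O", "O", "O"]
pred = ["B-PER", "O", "B-PER", "O", "O", "B-PER", "O"]
print(round(micro_f1(gold, pred), 3))  # 0.667
```

In the toy example there are 2 true positives, 1 false positive, and 1 false negative, so precision and recall are both 2/3.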


Demo Usage:

from flair.data import Sentence
from flair.models import SequenceTagger

# load tagger
tagger = SequenceTagger.load("aehrm/droc-character-recognizer")

# make example sentence
sentence = Sentence("Effi folgte Graf Instetten nach Kessin.")

# predict NER tags
tagger.predict(sentence)

# print sentence
print(sentence)
# >>> Sentence[7]: "Effi folgte Graf Instetten nach Kessin." → ["Effi"/PER, "Graf Instetten"/PER]

# print predicted NER spans
print('The following NER tags are found:')
# iterate over entities and print
for entity in sentence.get_spans('character'):
    print(entity)
# >>> Span[0:1]: "Effi" → PER (1.0)
# >>> Span[2:4]: "Graf Instetten" → PER (1.0)
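
The spans printed above correspond to runs of B-PER and I-PER token labels. As a rough sketch of that relationship, the following decodes a BIO tag sequence into spans. The helper `bio_to_spans` and the hand-written tags are hypothetical and only illustrate the tagging scheme; this is not how Flair decodes predictions internally.

```python
from typing import List, Sequence, Tuple


def bio_to_spans(tokens: Sequence[str],
                 tags: Sequence[str]) -> List[Tuple[int, int, str]]:
    """Group B-PER/I-PER tags into (start, end, text) spans; end is exclusive."""
    spans: List[Tuple[int, int, str]] = []
    start = None  # index of the first token of the currently open span
    for i, tag in enumerate(tags):
        if tag == "B-PER" or (tag == "I-PER" and start is None):
            # A new mention begins; close any span that is still open.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
            start = i
        elif tag != "I-PER":
            # Any other tag (e.g. "O") ends the open span.
            if start is not None:
                spans.append((start, i, " ".join(tokens[start:i])))
            start = None
    if start is not None:
        spans.append((start, len(tokens), " ".join(tokens[start:])))
    return spans


# Hand-written tags for the example sentence (assumed, not model output).
tokens = ["Effi", "folgte", "Graf", "Instetten", "nach", "Kessin", "."]
tags = ["B-PER", "O", "B-PER", "I-PER", "O", "O", "O"]
print(bio_to_spans(tokens, tags))
# [(0, 1, 'Effi'), (2, 4, 'Graf Instetten')]
```

The token offsets match the `Span[0:1]` and `Span[2:4]` indices shown in the demo output.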

Cite:

Please cite the following paper when using this model.


@inproceedings{ehrmanntraut-et-al-llpro-2023,
    address = {Ingolstadt, Germany},
    title = {{LLpro}: A Literary Language Processing Pipeline for {German} Narrative Text},
    booktitle = {Proceedings of the 19th Conference on Natural Language Processing ({KONVENS} 2023)},
    publisher = {{KONVENS} 2023 Organizers},
    author = {Ehrmanntraut, Anton and Konle, Leonard and Jannidis, Fotis},
    year = {2023},
}