How to use emanuelaboros/globalise-entity-linker with Transformers:
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained("emanuelaboros/globalise-entity-linker")
model = AutoModelForSeq2SeqLM.from_pretrained("emanuelaboros/globalise-entity-linker")

emanuelaboros/historic-nel
The model is based on mGENRE (multilingual Generative ENtity REtrieval), proposed by De Cao et al., a sequence-to-sequence architecture for entity disambiguation built on mBART. It uses constrained generation to output entity names, which are then mapped to Wikidata QIDs.
This is an entity linking model for historical VOC/GLOBALISE archival data: it links named entities from early modern Dutch colonial sources to curated GLOBALISE/Dataverse authority records. The model was fine-tuned on the HIPE-2022 dataset.
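To make the idea of constrained generation more concrete, here is a minimal sketch of the usual approach: a prefix tree (trie) over the token sequences of valid entity names, queried at each decoding step for the tokens that may come next. This is an illustration only, not the code used by this model or by mGENRE. The token IDs, the `TokenTrie` class, and the `allowed_tokens` helper are all hypothetical.

```python
from typing import Dict, Iterable, List


class TokenTrie:
    """Prefix tree over the token-ID sequences of valid entity names."""

    def __init__(self, sequences: Iterable[List[int]]) -> None:
        self.root: Dict[int, dict] = {}
        for seq in sequences:
            node = self.root
            for tok in seq:
                node = node.setdefault(tok, {})

    def allowed_next(self, prefix: List[int]) -> List[int]:
        """Return the token IDs that can follow `prefix`, or [] if none can."""
        node = self.root
        for tok in prefix:
            if tok not in node:
                return []
            node = node[tok]
        return sorted(node)


# Hypothetical token IDs standing in for three tokenized entity names.
EOS = 2
candidate_names = [
    [10, 11, EOS],
    [10, 12, 13, EOS],
    [20, EOS],
]
trie = TokenTrie(candidate_names)


def allowed_tokens(batch_id: int, prefix: List[int]) -> List[int]:
    """Callback-style wrapper: which tokens may be generated after `prefix`?"""
    return trie.allowed_next(prefix)


print(allowed_tokens(0, []))            # [10, 20]
print(allowed_tokens(0, [10]))          # [11, 12]
print(allowed_tokens(0, [10, 12, 13]))  # [2]
```

In Transformers, a function of this shape can be passed to `generate` through its `prefix_allowed_tokens_fn` argument. In that setting the prefix arrives as a tensor and includes the decoder start tokens, so a real callback would need to convert it to a list and strip those tokens first.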
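from transformers import AutoTokenizer, pipeline

NEL_MODEL_NAME = "emanuelaboros/globalise-entity-linker"

# Load the tokenizer that ships with the model.
nel_tokenizer = AutoTokenizer.from_pretrained(NEL_MODEL_NAME)

# "generic-nel" is not a built-in Transformers task; it is loaded from the
# model repository, which is why trust_remote_code=True is required.
nel_pipeline = pipeline(
    "generic-nel",
    model=NEL_MODEL_NAME,
    tokenizer=nel_tokenizer,
    trust_remote_code=True,
    device="cpu",
)
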
sentence = "Le 0ctobre 1894, [START] Dreyfvs [END] est arrêté à Paris, accusé d'espionnage pour l'Allemagne — un événement qui déch1ra la société fr4nçaise pendant des années."
print(nel_pipeline(sentence))
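
The example above marks the mention to be linked with `[START]` and `[END]`. If mentions come from an upstream NER step as character offsets, a small helper can insert the markers. The sketch below assumes the same spacing as the example sentence; `mark_mention` is a hypothetical helper, not part of the model's API.

```python
def mark_mention(text: str, start: int, end: int) -> str:
    """Wrap text[start:end] in [START] ... [END] markers."""
    if not 0 <= start < end <= len(text):
        raise ValueError(f"invalid span ({start}, {end}) for text of length {len(text)}")
    return f"{text[:start]}[START] {text[start:end]} [END]{text[end:]}"


text = "Dreyfus est arrêté à Paris."
begin = text.index("Paris")
marked = mark_mention(text, begin, begin + len("Paris"))
print(marked)  # Dreyfus est arrêté à [START] Paris [END].
```

The resulting string can then be passed to `nel_pipeline` in the same way as `sentence` above.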