Token Classification
GLiNER
PyTorch
multilingual

About

This is a GLiNER model fine-tuned on medieval Latin to improve the identification of PERSON and LOC entities. It was fine-tuned from urchade/gliner_multi-v2.1 on 1,500 annotations from the HOME-Alcar sentences. The training set was deliberately limited to 1,500 annotations to reduce the risk of catastrophic forgetting.

GLiNER is a Named Entity Recognition (NER) model capable of identifying any entity type using a bidirectional transformer encoder (BERT-like). It provides a practical alternative to traditional NER models, which are limited to predefined entity types, and to Large Language Models (LLMs), which, despite their flexibility, are too large and costly for resource-constrained scenarios.

Installation

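To use this model, you must install the GLiNER Python library. The leading ! below runs the command from a notebook cell; omit it when installing from a terminal: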

!pip install gliner

Usage

Once you've installed the GLiNER library, you can import the GLiNER class. You can then load this model with GLiNER.from_pretrained and predict entities with predict_entities, passing the text and the list of labels to look for.

from gliner import GLiNER

model = GLiNER.from_pretrained("medieval-data/gliner_multi-v2.1-medieval-latin")

text = """
Testes : magister Stephanus cantor Autissiodorensis , Petrus capellanus comitis , Gaufridus clericus , Hugo de Argenteolo , Milo Filluns , Johannes Maleherbe , Nivardus de Argenteolo , Columbus tunc prepositus Tornodorensis , Johannes prepositus Autissiodorensis , Johannes Brisebarra .
"""

labels = ["PERSON", "LOC"]

entities = model.predict_entities(text, labels)

for entity in entities:
    print(entity["text"], "=>", entity["label"])

Output:

Stephanus => PERSON
Autissiodorensis => LOC
Petrus => PERSON
Gaufridus => PERSON
Hugo de Argenteolo => PERSON
Milo Filluns => PERSON
Johannes Maleherbe => PERSON
Nivardus de Argenteolo => PERSON
Columbus => PERSON
Tornodorensis => LOC
Johannes => PERSON
Autissiodorensis => LOC
Johannes Brisebarra => PERSON
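
For a witness list like the one above, it can be handy to collect the predictions by label. The following is a minimal sketch, not part of the GLiNER library: group_by_label is a hypothetical helper, and sample_entities is a hand-copied subset of the output printed above, standing in for the list returned by model.predict_entities so that the snippet runs without downloading the model. It assumes only the "text" and "label" keys used in the example.

```python
from collections import defaultdict


def group_by_label(entities):
    """Group entity texts by label (hypothetical helper, not a GLiNER API).

    Repeated surface forms are kept once, in first-seen order.
    """
    grouped = defaultdict(list)
    for entity in entities:
        mentions = grouped[entity["label"]]
        if entity["text"] not in mentions:
            mentions.append(entity["text"])
    return dict(grouped)


# Hand-copied subset of the predictions shown above; in practice this
# would be the list returned by model.predict_entities(text, labels).
sample_entities = [
    {"text": "Stephanus", "label": "PERSON"},
    {"text": "Autissiodorensis", "label": "LOC"},
    {"text": "Petrus", "label": "PERSON"},
    {"text": "Tornodorensis", "label": "LOC"},
    {"text": "Autissiodorensis", "label": "LOC"},
]

grouped = group_by_label(sample_entities)
print(grouped["PERSON"])  # ['Stephanus', 'Petrus']
print(grouped["LOC"])     # ['Autissiodorensis', 'Tornodorensis']
```

Note that deduplicating on the surface form alone would also merge distinct people who share a name, so for prosopographical work it may be safer to keep every mention.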

Citation to Original GLiNER Model

@misc{zaratiana2023gliner,
      title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer}, 
      author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois},
      year={2023},
      eprint={2311.08526},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
