metadata
pipeline_tag: token-classification
tags:
- named-entity-recognition
- sequence-tagger-model
widget:
- text: Numele meu este Amadeus Wolfgang și locuiesc în Berlin
inference:
  parameters:
    aggregation_strategy: simple
    grouped_entities: true
language:
- ro
An XLM-RoBERTa model fine-tuned on the RONEC dataset for Romanian named-entity recognition, reaching a macro-F1 of about 0.95 on the test set.
Test metric | Results |
---|---|
test_f1_mac_ronec | 0.9547659158706665 |
test_loss_ronec | 0.16371206939220428 |
test_prec_mac_ronec | 0.8663718700408936 |
test_rec_mac_ronec | 0.8695588111877441 |
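For readers unfamiliar with macro-averaged metrics, the sketch below shows the usual definition: precision, recall, and F1 are computed per entity class and then averaged with equal weight per class. It assumes this standard definition applies to the figures above; the function name `macro_scores` and the per-class counts are hypothetical and are not taken from this model's evaluation.

```python
from typing import Dict, Tuple


def macro_scores(counts: Dict[str, Tuple[int, int, int]]) -> Tuple[float, float, float]:
    """Return (macro precision, macro recall, macro F1).

    `counts` maps each label to (true_positives, false_positives, false_negatives).
    """
    precisions, recalls, f1s = [], [], []
    for tp, fp, fn in counts.values():
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)
    n = len(counts)
    # Each class contributes equally, regardless of how many entities it has.
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical counts for two labels, for illustration only.
hypothetical_counts = {"PER": (90, 10, 10), "GPE": (40, 10, 0)}
macro_p, macro_r, macro_f1 = macro_scores(hypothetical_counts)
```

Because macro F1 is the mean of the per-class F1 scores, it is generally not equal to the harmonic mean of macro precision and macro recall.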
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

# Load the tokenizer and the fine-tuned token-classification model from the Hub.
model_id = "EvanD/xlm-roberta-base-romanian-ner-ronec"
tokenizer = AutoTokenizer.from_pretrained(model_id)
ner_model = AutoModelForTokenClassification.from_pretrained(model_id)

# aggregation_strategy="simple" merges sub-word tokens into whole entity spans.
nlp = pipeline("ner", model=ner_model, tokenizer=tokenizer, aggregation_strategy="simple")

# Romanian for "My name is Amadeus Wolfgang and I live in Berlin".
example = "Numele meu este Amadeus Wolfgang și locuiesc în Berlin"
ner_results = nlp(example)
print(ner_results)
# [
# {
# 'entity_group': 'PER',
# 'score': 0.9966806,
# 'word': 'Amadeus Wolfgang',
# 'start': 16,
# 'end': 32
# },
# {
# 'entity_group': 'GPE',
# 'score': 0.99694663,
# 'word': 'Berlin',
# 'start': 48,
# 'end': 54
# }
# ]
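
The aggregated output is a list of dictionaries, so it can be post-processed with plain Python. The sketch below groups entity strings by label and drops low-confidence predictions. The helper `group_entities` and the `min_score` threshold of 0.9 are hypothetical choices for illustration, and `ner_results` is hard-coded from the sample output above rather than produced by a fresh model run.

```python
from collections import defaultdict
from typing import Any, Dict, List

# Sample pipeline output copied from the example above (not a new model run).
ner_results: List[Dict[str, Any]] = [
    {"entity_group": "PER", "score": 0.9966806, "word": "Amadeus Wolfgang", "start": 16, "end": 32},
    {"entity_group": "GPE", "score": 0.99694663, "word": "Berlin", "start": 48, "end": 54},
]


def group_entities(results: List[Dict[str, Any]], min_score: float = 0.9) -> Dict[str, List[str]]:
    """Group entity strings by label, keeping predictions with score >= min_score."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entity in results:
        if entity["score"] >= min_score:
            grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)


entities_by_type = group_entities(ner_results)
print(entities_by_type)
```

A suitable threshold depends on the application; raising `min_score` trades recall for precision.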