---
pipeline_tag: token-classification
tags:
- named-entity-recognition
- sequence-tagger-model
widget:
- text: Numele meu este Amadeus Wolfgang și locuiesc în Berlin
inference:
  parameters:
    aggregation_strategy: simple
    grouped_entities: true
language:
- ro
---

XLM-RoBERTa model fine-tuned on the [RONEC](https://github.com/dumitrescustefan/ronec) dataset, reaching an F1-macro of about 0.95 on the test set.

| Test metric          | Result             |
|----------------------|--------------------|
| test_f1_mac_ronec    | 0.9547659158706665 |
| test_loss_ronec      | 0.16371206939220428 |
| test_prec_mac_ronec  | 0.8663718700408936 |
| test_rec_mac_ronec   | 0.8695588111877441 |

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

tokenizer = AutoTokenizer.from_pretrained("EvanD/xlm-roberta-base-romanian-ner-ronec")
ner_model = AutoModelForTokenClassification.from_pretrained("EvanD/xlm-roberta-base-romanian-ner-ronec")

# aggregation_strategy="simple" groups sub-word tokens into whole entities
nlp = pipeline("ner", model=ner_model, tokenizer=tokenizer, aggregation_strategy="simple")

example = "Numele meu este Amadeus Wolfgang și locuiesc în Berlin"
ner_results = nlp(example)
print(ner_results)
# [
#     {
#         'entity_group': 'PER',
#         'score': 0.9966806,
#         'word': 'Amadeus Wolfgang',
#         'start': 16,
#         'end': 32
#     },
#     {
#         'entity_group': 'GPE',
#         'score': 0.99694663,
#         'word': 'Berlin',
#         'start': 48,
#         'end': 54
#     }
# ]
```
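
If you prefer to skip the `pipeline` wrapper, the sketch below runs a raw forward pass and maps predicted label ids back to tag names through `ner_model.config.id2label`. It reuses the `tokenizer`, `ner_model`, and `example` defined above; this is only a minimal illustration of what the pipeline does internally and does not reproduce the `aggregation_strategy="simple"` entity grouping.

```python
import torch

# Tokenize the example sentence as PyTorch tensors
inputs = tokenizer(example, return_tensors="pt")

# Forward pass without gradient tracking; logits shape is (1, seq_len, num_labels)
with torch.no_grad():
    logits = ner_model(**inputs).logits

# Pick the highest-scoring label id for every sub-word token
predicted_ids = logits.argmax(dim=-1)[0]

# Map ids to BIO tags (e.g. B-PER, I-PER, O) and print non-O, non-special tokens
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, label_id in zip(tokens, predicted_ids.tolist()):
    label = ner_model.config.id2label[label_id]
    if label != "O" and token not in tokenizer.all_special_tokens:
        print(f"{token}\t{label}")
```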