bertimbau-large-ner-selective
This model card aims to simplify the use of the Portuguese BERT, a.k.a. BERTimbau, for the Named Entity Recognition (NER) task.
For this model card, we used the BERT-CRF model (selective scenario, 5 classes) available in the ner_evaluation folder of the original BERTimbau repo.
Available classes are:
- PESSOA
- ORGANIZACAO
- LOCAL
- TEMPO
- VALOR
Usage
# Load model directly
from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("marquesafonso/bertimbau-large-ner-selective")
model = AutoModelForTokenClassification.from_pretrained("marquesafonso/bertimbau-large-ner-selective")
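Once the tokenizer and model are loaded this way, predictions can be read from the logits without going through a pipeline. The following is a minimal sketch, not the authors' reference code: the helper names `pair_tokens_with_labels` and `predict_token_labels` are hypothetical, and it assumes the label names are available in `model.config.id2label` and that special tokens follow the usual BERT convention (`[CLS]`, `[SEP]`, `[PAD]`). Labels are returned per word piece, so no aggregation into full entities is performed.

```python
def pair_tokens_with_labels(tokens, label_ids, id2label):
    """Pair each token with its predicted label name, skipping special tokens."""
    special = {"[CLS]", "[SEP]", "[PAD]"}  # assumed BERT-style special tokens
    return [
        (token, id2label[label_id])
        for token, label_id in zip(tokens, label_ids)
        if token not in special
    ]


def predict_token_labels(text, model_name="marquesafonso/bertimbau-large-ner-selective"):
    """Run the checkpoint on `text` and return (token, label) pairs."""
    # Imported lazily so the helper above works without these packages installed.
    import torch
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForTokenClassification.from_pretrained(model_name)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, sequence_length, num_labels)

    # Highest-scoring label id for each word piece of the single input sentence.
    label_ids = logits.argmax(dim=-1)[0].tolist()
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return pair_tokens_with_labels(tokens, label_ids, model.config.id2label)
```

Calling `predict_token_labels("Gonçalo Ramos joga no Benfica.")` downloads the checkpoint on first use. For entity-level output with character offsets, the pipeline shown in the Example section is the simpler route.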
Example
from transformers import pipeline
pipe = pipeline("ner", model="marquesafonso/bertimbau-large-ner-selective", aggregation_strategy='simple')
sentence = "Acima de Ederson, abaixo de Rúben Dias. É entre os dois jogadores do Manchester City que se vai colocar Gonçalo Ramos no ranking de vendas mais avultadas do Benfica."
result = pipe([sentence])
print(f"{sentence}\n{result}")
# Acima de Ederson, abaixo de Rúben Dias. É entre os dois jogadores do Manchester City que se vai colocar Gonçalo Ramos no ranking de vendas mais avultadas do Benfica.
# [[
# {'entity_group': 'PESSOA', 'score': 0.99694395, 'word': 'Ederson', 'start': 9, 'end': 16},
# {'entity_group': 'PESSOA', 'score': 0.9918462, 'word': 'Rúben Dias', 'start': 28, 'end': 38},
# {'entity_group': 'ORGANIZACAO', 'score': 0.96376556, 'word': 'Manchester City', 'start': 69, 'end': 84},
# {'entity_group': 'PESSOA', 'score': 0.9993823, 'word': 'Gonçalo Ramos', 'start': 104, 'end': 117},
# {'entity_group': 'ORGANIZACAO', 'score': 0.9033079, 'word': 'Benfica', 'start': 157, 'end': 164}
# ]]
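The pipeline returns one list of entity dictionaries per input sentence, so a common next step is to group the recognized words by class. The sketch below works on the output shown above, copied in as a literal so that it runs without downloading the model. The helper `group_entities` and its `min_score` parameter are hypothetical names introduced for illustration, and the 0.95 threshold is an arbitrary choice.

```python
from collections import defaultdict

# Pipeline output copied from the example above (one inner list per input sentence).
result = [[
    {'entity_group': 'PESSOA', 'score': 0.99694395, 'word': 'Ederson', 'start': 9, 'end': 16},
    {'entity_group': 'PESSOA', 'score': 0.9918462, 'word': 'Rúben Dias', 'start': 28, 'end': 38},
    {'entity_group': 'ORGANIZACAO', 'score': 0.96376556, 'word': 'Manchester City', 'start': 69, 'end': 84},
    {'entity_group': 'PESSOA', 'score': 0.9993823, 'word': 'Gonçalo Ramos', 'start': 104, 'end': 117},
    {'entity_group': 'ORGANIZACAO', 'score': 0.9033079, 'word': 'Benfica', 'start': 157, 'end': 164},
]]


def group_entities(entities, min_score=0.0):
    """Map each entity class to its words, keeping entities scoring at least min_score."""
    grouped = defaultdict(list)
    for entity in entities:
        if entity["score"] >= min_score:
            grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)


by_class = group_entities(result[0])
print(by_class)
# {'PESSOA': ['Ederson', 'Rúben Dias', 'Gonçalo Ramos'], 'ORGANIZACAO': ['Manchester City', 'Benfica']}

confident = group_entities(result[0], min_score=0.95)
print(confident)
# {'PESSOA': ['Ederson', 'Rúben Dias', 'Gonçalo Ramos'], 'ORGANIZACAO': ['Manchester City']}
```

The `start` and `end` fields are character offsets into the input sentence, so they can be used to highlight entities in the original text.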
Acknowledgements
This work is an adaptation of the Portuguese BERT, a.k.a. BERTimbau. You may check and/or cite the original work:
@InProceedings{souza2020bertimbau,
author="Souza, F{\'a}bio and Nogueira, Rodrigo and Lotufo, Roberto",
editor="Cerri, Ricardo and Prati, Ronaldo C.",
title="BERTimbau: Pretrained BERT Models for Brazilian Portuguese",
booktitle="Intelligent Systems",
year="2020",
publisher="Springer International Publishing",
address="Cham",
pages="403--417",
isbn="978-3-030-61377-8"
}
@article{souza2019portuguese,
title={Portuguese Named Entity Recognition using BERT-CRF},
author={Souza, F{\'a}bio and Nogueira, Rodrigo and Lotufo, Roberto},
journal={arXiv preprint arXiv:1909.10649},
url={http://arxiv.org/abs/1909.10649},
year={2019}
}
Note that the authors (Fabio Capuano de Souza, Rodrigo Nogueira, and Roberto de Alencar Lotufo) released their work under the MIT license.