---
language:
- pt
- gl
- multilingual
license: gpl-3.0
widget:
- text: A minha amiga Rosa, de São Paulo, estudou en Montreal. Agora trabalha em Santiago
    de Compostela com o Mário.
---

# Named Entity Recognition (NER) model for Portuguese

This is a NER model for Portuguese that uses the standard 'enamex' classes: LOC (geographical locations), PER (people), ORG (organizations), and MISC (other entities).
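To illustrate how the four classes are typically consumed downstream, here is a minimal sketch that merges token-level tags into entity spans. It assumes a BIO tagging scheme (`B-PER`, `I-LOC`, `O`, ...), which this card does not state explicitly; the helper name `group_bio` and the sample tokens and tags are hypothetical.

```python
from typing import List, Tuple

# The four 'enamex' classes listed in this card.
ENAMEX_CLASSES = {"LOC", "PER", "ORG", "MISC"}


def group_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge token-level BIO tags into (entity_text, class) spans.

    Assumption: tags look like 'B-PER', 'I-PER' or 'O'. An 'I-' tag that
    does not continue an open span of the same class is ignored.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_class = None

    def flush() -> None:
        # Close the span being built, if any.
        if current_tokens:
            spans.append((" ".join(current_tokens), current_class))

    for token, tag in zip(tokens, tags):
        prefix, _, cls = tag.partition("-")
        if prefix == "B" and cls in ENAMEX_CLASSES:
            flush()
            current_tokens, current_class = [token], cls
        elif prefix == "I" and current_tokens and cls == current_class:
            current_tokens.append(token)
        else:
            flush()
            current_tokens, current_class = [], None
    flush()
    return spans


# Hypothetical, hand-written tags (not real model output).
tokens = ["Rosa", "estudou", "em", "Santiago", "de", "Compostela"]
tags = ["B-PER", "O", "O", "B-LOC", "I-LOC", "I-LOC"]
print(group_bio(tokens, tags))
```

If the checkpoint uses a different label scheme, the label names can be read from `id2label` in its configuration and the prefix handling adjusted accordingly.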

The model is based on [BERTimbau Base](https://huggingface.co/neuralmind/bert-base-portuguese-cased), which has been fine-tuned using a combination of available corpora (see [1] for details).

There is an alternative model trained using [BERTimbau Large](https://huggingface.co/neuralmind/bert-large-portuguese-cased): [bert-large-pt-ner-enamex](https://huggingface.co/marcosgg/bert-large-pt-ner-enamex).
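A minimal usage sketch with the `transformers` token-classification pipeline follows. The repository id of this Base model is not stated in this card, so the sketch defaults to the Large variant linked above; replace `MODEL_ID` with the id of the model you want. The helper names `load_ner`, `summarize`, and `demo` are hypothetical.

```python
from typing import Dict, List

# Assumption: defaults to the Large variant linked in this card.
# Replace with the repository id of the Base model to use that one instead.
MODEL_ID = "marcosgg/bert-large-pt-ner-enamex"


def load_ner(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily; requires `transformers`

    # aggregation_strategy="simple" merges word pieces into whole entities.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def summarize(entities: List[Dict]) -> List[str]:
    """Format aggregated pipeline output as 'text [CLASS]' strings."""
    return [f"{e['word']} [{e['entity_group']}]" for e in entities]


def demo() -> None:
    """Run the pipeline on the widget sentence from this card."""
    ner = load_ner()
    text = (
        "A minha amiga Rosa, de São Paulo, estudou en Montreal. "
        "Agora trabalha em Santiago de Compostela com o Mário."
    )
    for line in summarize(ner(text)):
        print(line)
```

Calling `demo()` downloads the checkpoint and prints one line per detected entity; the exact spans and classes depend on the model chosen.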

This model was trained for 3 epochs with a batch size of 8 and a learning rate of 2e-5. On the test set it achieved a Precision/Recall/F1 of 0.913/0.918/0.915.
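As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported F1 can be recomputed from the other two figures. The sketch below assumes the three numbers are aggregate scores over the whole test set; the card does not say how they were averaged.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in this card.
f1 = f1_score(0.913, 0.918)
print(round(f1, 3))  # 0.915, matching the reported F1
```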

[1] Pablo Gamallo, Marcos Garcia & Patricia Martín-Rodilla, 2019. [NER and open information extraction for Portuguese notebook for IberLEF 2019 Portuguese named entity recognition and relation extraction tasks](https://ceur-ws.org/Vol-2421/NER_Portuguese_paper_6.pdf). In _Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2019) co-located with 35th Conference of the Spanish Society for Natural Language Processing (SEPLN 2019)_: 457-467.