---
license: mit
language:
- pt
metrics:
- name: Precision
  type: Precision
  value: 0.783
- name: Recall
  type: Recall
  value: 0.774
- name: F1-Score
  type: F1-Score
  value: 0.779
library_name: transformers
pipeline_tag: token-classification
tags:
- BERT
- CRF
- NER
- Portuguese
- Literature
---

# LitBERT-CRF

LitBERT-CRF is a fine-tuned BERT-CRF model designed for Named Entity Recognition (NER) in Portuguese-language literature.

## Model Details

### Model Description

LitBERT-CRF uses a BERT-CRF architecture. The base model was pre-trained on the brWaC corpus and fine-tuned on the HAREM dataset for Portuguese NER. LitBERT-CRF additionally incorporates domain-specific literary data through Masked Language Modeling (MLM), which makes it well suited to identifying named entities in literary texts.

- **Model type:** BERT-CRF for NER
- **Language:** Portuguese
- **Fine-tuned from model:** BERT-CRF on brWaC and HAREM

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

PPORTAL_ner dataset

#### Metrics

- **Precision**: 0.783
- **Recall**: 0.774
- **F1-score**: 0.779

## Citation

**BibTeX:**

```
@inproceedings{silva-moro-2024-evaluating,
    title = "Evaluating Pre-training Strategies for Literary Named Entity Recognition in {P}ortuguese",
    author = "Silva, Mariana O. and Moro, Mirella M.",
    editor = "Gamallo, Pablo and Claro, Daniela and Teixeira, Ant{\'o}nio and Real, Livy and Garcia, Marcos and Oliveira, Hugo Gon{\c{c}}alo and Amaro, Raquel",
    booktitle = "Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 1",
    month = mar,
    year = "2024",
    address = "Santiago de Compostela, Galicia/Spain",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.propor-1.39",
    pages = "384--393",
}
```

**APA:**

Mariana O. Silva and Mirella M. Moro. 2024. Evaluating Pre-training Strategies for Literary Named Entity Recognition in Portuguese. In Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 1, pages 384–393, Santiago de Compostela, Galicia/Spain. Association for Computational Linguistics.
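
As background on the BERT-CRF architecture named in the Model Description: in such models, the encoder typically produces a score for each tag at each token, and the CRF layer adds learned tag-to-tag transition scores, so the best tag sequence is found jointly rather than token by token. The sketch below illustrates that decoding step (Viterbi) in plain Python. It is not this model's implementation: the function name `viterbi_decode`, the three-tag label set, and all scores are hypothetical values chosen for illustration.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[t][j]   : score of tag j at token t (from the encoder).
    transitions[i][j] : score of moving from tag i to tag j (from the CRF).
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # score[j] = best score of any path that ends in tag j at the current token
    score = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(num_tags):
            # Best previous tag i for arriving at tag j
            best_prev = max(range(num_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j] + emissions[t][j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Follow the backpointers from the best final tag to recover the path
    path = [max(range(num_tags), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical label set and toy scores (not taken from the model).
tags = ["O", "B-PER", "I-PER"]
emissions = [
    [0.1, 2.0, 0.5],  # token 1: B-PER scores highest
    [1.0, 0.2, 0.9],  # token 2: O scores slightly higher than I-PER
    [2.0, 0.1, 0.3],  # token 3: O scores highest
]
transitions = [
    [0.5, 0.0, -10.0],  # from O: moving to I-PER is heavily penalized
    [0.0, -1.0, 1.0],   # from B-PER: moving to I-PER is rewarded
    [0.0, -1.0, 0.5],   # from I-PER
]

best_path = viterbi_decode(emissions, transitions)
print([tags[i] for i in best_path])  # ['B-PER', 'I-PER', 'O']
```

Picking the highest emission score at each token independently would give `B-PER, O, O` here. The transition scores shift the second token to `I-PER`, which is the kind of sequence-level consistency a CRF layer is meant to provide.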