---
language: pt
widget:
  - text: "O paciente recebeu [MASK] do hospital."
  - text: "O médico receitou a medicação para controlar a [MASK]."
  - text: "O principal [MASK] da COVID-19 é tosse seca."
  - text: "O vírus da gripe apresenta um [MASK] constituído por segmentos de ácido ribonucleico."
datasets:
  - biomedical literature from Scielo and Pubmed
thumbnail: "https://raw.githubusercontent.com/HAILab-PUCPR/BioBERTpt/master/images/logo-biobertpr1.png"
---
![Logo BioBERTpt](https://raw.githubusercontent.com/HAILab-PUCPR/BioBERTpt/master/images/logo-biobertpr1.png)

# BioBERTpt - Portuguese Clinical and Biomedical BERT

The paper *BioBERTpt - A Portuguese Neural Language Model for Clinical Named Entity Recognition* introduces clinical and biomedical BERT-based models for the Portuguese language, initialized from BERT-Multilingual-Cased and further pretrained on clinical notes and biomedical literature.

This model card describes BioBERTpt(all), the full version trained on both clinical narratives and biomedical literature in Portuguese.

## How to use the model

Load the model via the transformers library:

```python
from transformers import AutoTokenizer, AutoModel

# Load the BioBERTpt(all) tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("pucpr/biobertpt-all")
model = AutoModel.from_pretrained("pucpr/biobertpt-all")
```
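
To reproduce the widget examples above, the model can also be queried for masked-token predictions. The snippet below is a minimal sketch, not part of the original card, assuming the standard `transformers` fill-mask pipeline and reusing one of the widget sentences as the prompt:

```python
from transformers import pipeline

# Masked-language-modeling pipeline backed by BioBERTpt(all)
fill_mask = pipeline("fill-mask", model="pucpr/biobertpt-all")

# Widget sentence: "O principal [MASK] da COVID-19 é tosse seca."
# ("The main [MASK] of COVID-19 is a dry cough.")
for prediction in fill_mask("O principal [MASK] da COVID-19 é tosse seca."):
    print(prediction["token_str"], round(prediction["score"], 4))
```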

## More Information

Refer to the original paper, *BioBERTpt - A Portuguese Neural Language Model for Clinical Named Entity Recognition*, for additional details and for performance on Portuguese NER tasks.

## Questions?

Post a GitHub issue on the [BioBERTpt repo](https://github.com/HAILab-PUCPR/BioBERTpt).