FelipeCasali-USP committed on
Commit 64cafa0 · 1 Parent(s): 61d244e

Update README.md

  1. README.md +4 -4
README.md CHANGED
@@ -13,7 +13,7 @@ widget:
  example_title: "Felipe Casali Silva, Teste, Rio de Janeiro, RJ"
  ---
 
- # lgpd_pii_identifier : Financial BERT PT BR
+ # lgpd_pii_identifier : LGPD PII Identifier
 
  lgpd_pii_identifier is a pre-trained NLP model to identify sensitive data in the scope of LGPD (Lei Geral de Proteção de Dados)
 
@@ -32,7 +32,7 @@ data according to their businness needs, and governance rules.
  In order to use the model, you need to get the HuggingFace auth token. You can get it [here](https://huggingface.co/settings/token).
 
  ```python
- from transformers import AutoTokenizer, BertForSequenceClassification
+ from transformers import DistilBertModel, DistilBertTokenizer
  import numpy as np
 
  pred_mapper = {
@@ -42,8 +42,8 @@ pred_mapper = {
  3: "estado"
  }
 
- tokenizer = AutoTokenizer.from_pretrained("FelipeCasali-USP/lgpd_pii_identifier")
- lgpd_pii_identifier = BertForSequenceClassification.from_pretrained("FelipeCasali-USP/lgpd_pii_identifier")
+ tokenizer = DistilBertTokenizer.from_pretrained("FelipeCasali-USP/lgpd_pii_identifier")
+ lgpd_pii_identifier = DistilBertModel.from_pretrained("FelipeCasali-USP/lgpd_pii_identifier")
 
  tokens = tokenizer(["String to be analized"], return_tensors="pt",
  padding=True, truncation=True, max_length=512)
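
The README snippet in this diff is cut off after tokenization, so the step that turns model output into a label is not visible. As a minimal sketch of how a mapping like `pred_mapper` is commonly applied, assuming the model yields one row of class scores per input string (an assumption, since the diff does not show this step), the example below uses a made-up NumPy array in place of real model output. Only the `3: "estado"` entry appears in the diff, so the other entries are left out rather than guessed, and `labels_for` is a hypothetical helper name.

```python
import numpy as np

# Only this entry is visible in the diff; the remaining entries are not
# shown there, so they are omitted here instead of being guessed.
pred_mapper = {
    3: "estado",
}


def labels_for(scores, mapper):
    """Map each row of class scores to a label via its highest-scoring index.

    `scores` is assumed to have shape (batch, num_classes). Indices that are
    missing from `mapper` come back as None.
    """
    indices = np.argmax(scores, axis=-1)
    return [mapper.get(int(i)) for i in indices]


# Placeholder scores, NOT real model output: one input, four classes.
fake_scores = np.array([[0.1, 0.2, 0.3, 2.5]])
print(labels_for(fake_scores, pred_mapper))
```

One point to keep in mind when adapting this: in `transformers`, `DistilBertModel` is the bare encoder and returns hidden states rather than per-class scores, so a classification head would be needed before an argmax like the one above applies. Whether the rest of the README handles that is not visible in this diff.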