Update README.md

README.md
@@ -23,7 +23,7 @@ Legal_BERTimbau Large is a fine-tuned BERT model based on [BERTimbau](https://hu
For further information or requests, please go to the [BERTimbau repository](https://github.com/neuralmind-ai/portuguese-bert/).

The performance of Language Models can change drastically when there is a domain shift between training and test data. To create a Portuguese Language Model adapted to the legal domain, the original BERTimbau model was submitted to a fine-tuning stage in which one "PreTraining" epoch was performed over 200,000 cleaned documents (lr: 1e-5, using the TSDAE technique).
## Available models
@@ -38,7 +38,7 @@ The performance of Language Models can change drastically when there is a domain
```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("rufimelo/Legal-BERTimbau-large-TSDAE-v3")
model = AutoModelForMaskedLM.from_pretrained("rufimelo/Legal-BERTimbau-large-TSDAE-v3")
```
@@ -49,8 +49,8 @@ model = AutoModelForMaskedLM.from_pretrained("rufimelo/Legal-BERTimbau-large-TSD
```python
from transformers import pipeline
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("rufimelo/Legal-BERTimbau-large-TSDAE-v3")
model = AutoModelForMaskedLM.from_pretrained("rufimelo/Legal-BERTimbau-large-TSDAE-v3")

pipe = pipeline('fill-mask', model=model, tokenizer=tokenizer)
pipe('O advogado apresentou [MASK] para o juíz')
```
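The fill-mask pipeline returns a list of candidate completions, each a dict containing (among other fields) a `token_str` and a `score`. A small sketch of post-processing that output — the candidate words and scores below are invented placeholders, not actual predictions from this model:

```python
# Illustrative stand-in for pipe('O advogado apresentou [MASK] para o juíz');
# the words and scores are made up, not real model output.
predictions = [
    {"token_str": "recurso", "score": 0.31},
    {"token_str": "queixa", "score": 0.12},
    {"token_str": "resposta", "score": 0.08},
]

# Pick the highest-scoring candidate and fill it into the sentence.
best = max(predictions, key=lambda p: p["score"])
filled = "O advogado apresentou [MASK] para o juíz".replace("[MASK]", best["token_str"])
print(filled)  # O advogado apresentou recurso para o juíz
```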