jarodrigues committed c467e8f (parent: 4669967): Update README.md

README.md CHANGED
@@ -33,19 +33,22 @@ widget:
 
 # Albertina PT-BR Base
 
-
-**Albertina PT-*** is a foundation, large language model for the **Portuguese language**.
+**Albertina PT-BR Base** is a foundation, large language model for American **Portuguese** from **Brazil**.
 
 It is an **encoder** of the BERT family, based on the neural architecture Transformer and
 developed over the DeBERTa model, with most competitive performance for this language.
-It has versions that were trained for different variants of Portuguese,
-namely the European variant from Portugal (**PT-PT**) and the American variant from Brazil (**PT-BR**),
-and it is distributed free of charge and under a most permissive license.
+It is distributed free of charge and under a most permissive license.
 
+You may also be interested in [**Albertina PT-BR**](https://huggingface.co/PORTULAN/albertina-ptbr) and in [**Albertina PT-BR No-brWaC**](https://huggingface.co/PORTULAN/albertina-ptbr-nobrwac).
+These are larger versions and, to the best of our knowledge, the encoders specifically developed for this language and variant
+that, at the time of their initial distribution, set a new state of the art for it, and they are made publicly available
+and distributed for reuse.
 
-
+**Albertina PT-BR Base** is developed by a joint team from the University of Lisbon and the University of Porto, Portugal.
 For further details, check the respective [publication](https://arxiv.org/abs/2305.06721):
 
+
 ``` latex
 @misc{albertina-pt,
       title={Advancing Neural Encoding of Portuguese
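The card describes a Hub-hosted fill-mask encoder (note the `widget:` key in the hunk context). As a minimal usage sketch, not taken from the card itself: the model id `PORTULAN/albertina-ptbr-base` is inferred from the card title and the sibling links above, and the Brazilian-Portuguese prompt is an invented example.

```python
from transformers import pipeline

# Model id inferred from the card title; verify it on the Hugging Face Hub.
fill_mask = pipeline("fill-mask", model="PORTULAN/albertina-ptbr-base")

# DeBERTa-style encoders use the [MASK] token; the sentence is illustrative.
for pred in fill_mask("A culinária brasileira é rica em sabores e [MASK]."):
    print(f"{pred['token_str']:>15}  {pred['score']:.4f}")
```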
@@ -119,7 +122,8 @@ We address four tasks from those in PLUE, namely:
 
 | Model                    | RTE (Accuracy) | WNLI (Accuracy) | MRPC (F1) | STS-B (Pearson) |
 |--------------------------|----------------|-----------------|-----------|-----------------|
-| **Albertina-PT-BR
+| **Albertina-PT-BR**      | **0.7545**     | 0.4601          | **0.9071**| **0.8910**      |
+| **Albertina-PT-BR Base** | 0.6462         | **0.5493**      | 0.8779    | 0.8501          |
 
 
 <br>
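For context on the metric columns: RTE and WNLI are scored with accuracy, MRPC with F1, and STS-B with the Pearson correlation. A small sketch of how such scores are computed, using toy labels rather than the paper's actual predictions:

```python
from scipy.stats import pearsonr
from sklearn.metrics import accuracy_score, f1_score

# Toy gold labels and predictions; illustrative only.
gold, pred = [1, 0, 1, 1, 0], [1, 0, 0, 1, 0]
print("Accuracy (RTE/WNLI):", accuracy_score(gold, pred))  # 0.8
print("F1 (MRPC):", f1_score(gold, pred))                  # 0.8

# STS-B compares continuous similarity scores.
gold_s, pred_s = [4.5, 2.0, 3.3, 1.0], [4.1, 2.4, 3.0, 1.2]
print("Pearson (STS-B):", round(pearsonr(gold_s, pred_s)[0], 4))
```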