nicholasKluge
committed on
Commit
•
084b7c1
1
Parent(s):
428d553
Update README.md
README.md
CHANGED
@@ -122,26 +122,26 @@ trainer.train()
|
|
122 |
|
123 |
## Fine-Tuning Comparisons
|
124 |
|
125 |
-
|
126 |
-
|
127 |
-
|
|
128 |
-
|
129 |
-
|
|
130 |
-
|
|
131 |
-
|
|
|
|
|
|
|
|
132 |
|
133 |
## Cite as 🤗
|
134 |
|
135 |
```latex
|
136 |
|
137 |
-
@misc{
|
138 |
-
|
139 |
-
|
140 |
-
|
141 |
-
|
142 |
-
year = {2023},
|
143 |
-
publisher = {HuggingFace},
|
144 |
-
journal = {HuggingFace repository},
|
145 |
}
|
146 |
|
147 |
```
|
|
|
122 |
|
123 |
## Fine-Tuning Comparisons
|
124 |
|
125 |
+
To further evaluate the downstream capabilities of our models, we applied a basic fine-tuning procedure to our TTL pair on a subset of tasks from the Poeta benchmark. For comparison, we applied the same procedure to both [BERTimbau](https://huggingface.co/neuralmind/bert-base-portuguese-cased) models, since they are also LLMs trained from scratch in Brazilian Portuguese and fall in a similar size range to our models. We used these comparisons to assess whether our pre-training runs produced LLMs that achieve good results ("good" here means "close to BERTimbau") in downstream applications.
|
126 |
+
|
127 |
+
| Models          | IMDB      | FaQuAD-NLI | HateBr    | Assin2    | AgNews    | Average |
|-----------------|-----------|------------|-----------|-----------|-----------|---------|
| BERTimbau-large | **93.58** | 92.26      | 91.57     | **88.97** | 94.11     | 92.10   |
| BERTimbau-small | 92.22     | **93.07**  | 91.28     | 87.45     | 94.19     | 91.64   |
| **TTL-460m**    | 91.64     | 91.18      | **92.28** | 86.43     | **94.42** | 91.19   |
| **TTL-160m**    | 91.14     | 90.00      | 90.71     | 85.78     | 94.05     | 90.34   |
|
133 |
+
|
134 |
+
All results shown are the highest accuracy scores achieved on the respective task test sets after fine-tuning the models on the training sets. All fine-tuning runs used the same hyperparameters, and the code implementation can be found in the [model cards](https://huggingface.co/nicholasKluge/TeenyTinyLlama-460m-HateBR) of our fine-tuned models.
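
As a quick sanity check on the table, the `Average` column can be reproduced as the unweighted mean of the five task accuracies. The sketch below is illustrative only and is not part of the fine-tuning code in the model cards; the names `SCORES`, `average_accuracy`, and `gap_to_reference` are hypothetical helpers, and the numbers are copied from the table.

```python
# Accuracy scores (percent) copied from the comparison table.
# Column order: IMDB, FaQuAD-NLI, HateBr, Assin2, AgNews.
SCORES = {
    "BERTimbau-large": [93.58, 92.26, 91.57, 88.97, 94.11],
    "BERTimbau-small": [92.22, 93.07, 91.28, 87.45, 94.19],
    "TTL-460m": [91.64, 91.18, 92.28, 86.43, 94.42],
    "TTL-160m": [91.14, 90.00, 90.71, 85.78, 94.05],
}


def average_accuracy(task_scores):
    """Unweighted mean over tasks, rounded to two decimals."""
    return round(sum(task_scores) / len(task_scores), 2)


def gap_to_reference(averages, model, reference="BERTimbau-large"):
    """Difference in average accuracy (percentage points) to a reference model."""
    return round(averages[reference] - averages[model], 2)


averages = {model: average_accuracy(s) for model, s in SCORES.items()}

for model, avg in averages.items():
    print(f"{model}: average={avg:.2f}, gap={gap_to_reference(averages, model):.2f}")
```

The gap to `BERTimbau-large` is one simple way to quantify "close to BERTimbau"; other reference models or per-task gaps would work equally well.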
|
135 |
|
136 |
## Cite as 🤗
|
137 |
|
138 |
```latex
|
139 |
|
140 |
+
@misc{correa24ttllama,
|
141 |
+
title = {TeenyTinyLlama: a pair of open-source tiny language models trained in Brazilian Portuguese},
|
142 |
+
author = {Corr{\^e}a, Nicholas Kluge and Falk, Sophia and Fatimah, Shiza and Sen, Aniket and De Oliveira, Nythamar},
|
143 |
+
journal = {arXiv},
|
144 |
+
year = {2024},
|
|
|
|
|
|
|
145 |
}
|
146 |
|
147 |
```
|