Commit 550157b, committed by nicholasKluge
Parent(s): 78069cc

Update README.md

README.md CHANGED
@@ -39,7 +39,7 @@ co2_eq_emissions:
 
 ## Model Summary
 
-Given the lack of available monolingual foundational models in non-English languages and the fact that some of the most used and downloaded models by the community are those small enough to allow individual researchers and hobbyists to use them in low-resource environments, we developed the TeenyTinyLlama: _a series of small foundational models trained
+Given the lack of available monolingual foundational models in non-English languages and the fact that some of the most used and downloaded models by the community are those small enough to allow individual researchers and hobbyists to use them in low-resource environments, we developed the TeenyTinyLlama: _a series of small foundational models trained in Portuguese language._
 
 TeenyTinyLlama is a compact language model based on the Llama 2 architecture ([TinyLlama implementation](https://huggingface.co/TinyLlama)). This model is designed to deliver efficient natural language processing capabilities while being resource-conscious.
 
@@ -92,7 +92,7 @@ These are the main arguments used in the training of this model:
 | adam epsilon | 0.00000001 |
 | weight decay | 0.01 |
 | scheduler type | "cosine" |
-| warmup steps |
+| warmup steps | 5000 |
 | gradient checkpointing | false |
 | seed | 42 |
 | mixed precision | 'no' |
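
Reviewer note: the second hunk fills in the `warmup steps` value next to a `"cosine"` scheduler type. The sketch below shows one common reading of that pair of settings: linear warmup followed by cosine decay to zero. It is an illustration only, not the training script behind this README. The function name `cosine_lr_with_warmup` and the `peak_lr` and `total_steps` values are hypothetical, since neither the learning rate nor the total step count appears in this diff.

```python
import math

WARMUP_STEPS = 5000  # value added to the README table by this commit


def cosine_lr_with_warmup(step, peak_lr, total_steps, warmup_steps=WARMUP_STEPS):
    """Return the learning rate at `step`.

    Assumed behaviour (not confirmed by the README): the rate rises
    linearly from 0 to `peak_lr` over `warmup_steps`, then follows half
    a cosine wave down to 0 at `total_steps`.
    """
    if step < warmup_steps:
        # Linear warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    # Cosine decay phase.
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the function.
    peak_lr, total_steps = 1e-3, 100_000
    for s in (0, 2_500, 5_000, 52_500, 100_000):
        print(s, cosine_lr_with_warmup(s, peak_lr, total_steps))
```

With the old README line the warmup length was simply missing from the table, so a reader could not reproduce the shape of the schedule; the new value makes the warmup phase explicit.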