nicholasKluge committed on
Commit
550157b
1 Parent(s): 78069cc

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -39,7 +39,7 @@ co2_eq_emissions:

  ## Model Summary

- Given the lack of available monolingual foundational models in non-English languages and the fact that some of the most used and downloaded models by the community are those small enough to allow individual researchers and hobbyists to use them in low-resource environments, we developed the TeenyTinyLlama: _a series of small foundational models trained on Portuguese._
+ Given the lack of available monolingual foundational models in non-English languages and the fact that some of the most used and downloaded models by the community are those small enough to allow individual researchers and hobbyists to use them in low-resource environments, we developed the TeenyTinyLlama: _a series of small foundational models trained in Portuguese language._

  TeenyTinyLlama is a compact language model based on the Llama 2 architecture ([TinyLlama implementation](https://huggingface.co/TinyLlama)). This model is designed to deliver efficient natural language processing capabilities while being resource-conscious.

@@ -92,7 +92,7 @@ These are the main arguments used in the training of this model:
  | adam epsilon | 0.00000001 |
  | weight decay | 0.01 |
  | scheduler type | "cosine" |
- | warmup steps | 50000 |
+ | warmup steps | 5000 |
  | gradient checkpointing | false |
  | seed | 42 |
  | mixed precision | 'no' |
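
To show what the corrected `warmup steps` value means alongside the `"cosine"` scheduler type, here is a minimal sketch of a learning-rate schedule with linear warmup followed by cosine decay. Only the 5000 warmup steps and the cosine shape come from the table. The function name `lr_at_step`, the peak learning rate, the total step count, and the decay floor are hypothetical placeholders, and the sketch assumes the usual linear-warmup-then-cosine-decay shape rather than reproducing the training script used for this model.

```python
import math


def lr_at_step(step, peak_lr=1e-3, warmup_steps=5000, total_steps=100_000, min_lr=0.0):
    """Learning rate at `step` for linear warmup followed by cosine decay.

    warmup_steps=5000 matches the table; peak_lr, total_steps and min_lr
    are illustrative assumptions, not values taken from the README.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr over the warmup phase.
        return peak_lr * step / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    # Half-cosine decay from peak_lr down to min_lr.
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for s in (0, 2500, 5000, 52_500, 100_000):
        print(s, lr_at_step(s))
```

With 50000 warmup steps instead of 5000, the ramp would occupy ten times as many optimizer steps before the decay begins, which is why the one-digit change in the table matters for anyone reproducing the run.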