jarodrigues committed
Commit: cb47993
Parent(s): cba1aa1
Update README.md
README.md CHANGED
@@ -107,7 +107,7 @@ As codebase, we resorted to the [DeBERTa V2 XLarge](https://huggingface.co/micro
 
 To train **Albertina-PT-PT**, the data set was tokenized with the original DeBERTa tokenizer with a 128 token sequence truncation and dynamic padding.
 The model was trained using the maximum available memory capacity resulting in a batch size of 832 samples (52 samples per GPU and applying gradient accumulation in order to approximate the batch size of the PT-BR model).
-Similarly to the PT-BR variant
+Similarly to the PT-BR variant, we opted for a learning rate of 1e-5 with linear decay and 10k warm-up steps.
 However, since the number of training examples is approximately twice of that in the PT-BR variant, we reduced the number of training epochs to half and completed only 25 epochs, which resulted in approximately 245k steps.
 The model was trained for 3 days on a2-highgpu-8gb Google Cloud A2 VMs with 8 GPUs, 96 vCPUs and 680 GB of RAM.
 
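The hunk above completes the training recipe stated in the README: 128-token truncation with dynamic padding, a learning rate of 1e-5 with linear decay and 10k warm-up steps, 52 samples per GPU on 8 GPUs with gradient accumulation for an effective batch of 832, and 25 epochs (~245k steps). Below is a minimal sketch of how that configuration could be expressed with the Hugging Face Transformers `Trainer`, assuming an MLM objective and placeholder model/dataset handling; the actual training script is not part of this commit, so everything beyond the stated hyperparameters is an assumption.

```python
# Hypothetical reconstruction of the hyperparameters described in the README diff.
# Model and dataset handling are illustrative; only the stated numbers come from the text.
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

CODEBASE = "microsoft/deberta-v2-xlarge"  # DeBERTa V2 XLarge codebase named in the README

tokenizer = AutoTokenizer.from_pretrained(CODEBASE)
model = AutoModelForMaskedLM.from_pretrained(CODEBASE)

def tokenize(batch):
    # 128-token sequence truncation; padding is deferred to the collator (dynamic padding).
    return tokenizer(batch["text"], truncation=True, max_length=128)

# Pads each batch to its longest example (dynamic padding) and applies MLM masking.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True)

args = TrainingArguments(
    output_dir="albertina-ptpt",
    per_device_train_batch_size=52,   # 52 samples per GPU, 8 GPUs
    gradient_accumulation_steps=2,    # 52 * 8 * 2 = 832 effective batch size
    learning_rate=1e-5,               # same as the PT-BR variant
    lr_scheduler_type="linear",       # linear decay
    warmup_steps=10_000,              # 10k warm-up steps
    num_train_epochs=25,              # ~245k steps on this data set
)

# trainer = Trainer(model=model, args=args, data_collator=collator,
#                   train_dataset=tokenized_dataset)  # tokenized_dataset: assumed, built with `tokenize`
# trainer.train()
```

The `gradient_accumulation_steps=2` value is inferred from the stated numbers (832 / (52 × 8) = 2); the commit itself only says that gradient accumulation was used to approximate the batch size of the PT-BR model.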