gonzalez-agirre committed on
Commit a94e933
1 Parent(s): 5110c34

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -174,7 +174,7 @@ The dataset has the following language distribution:
 
 ## Training procedure
 
- The training corpus was tokenized using the byte-level version of [Byte-Pair Encoding (BPE)](https://github.com/openai/gpt-2) used in the original [RoBERTa](https://github.com/pytorch/fairseq/tree/master/examples/roberta) model, with a vocabulary size of 50,262 tokens. Once the model had been initialized, we continued its pre-training on the three target languages: Catalan, Spanish, and English. We kept a small amount of English to avoid catastrophic forgetting. Training lasted a total of 96 hours on 8 NVIDIA H100 GPUs with 80GB of memory each.
+ The training corpus was tokenized using the byte-level version of [Byte-Pair Encoding (BPE)](https://github.com/openai/gpt-2) used in the original [RoBERTa](https://github.com/pytorch/fairseq/tree/master/examples/roberta) model, with a vocabulary size of 50,257 tokens. Once the model had been initialized, we continued its pre-training on the three target languages: Catalan, Spanish, and English. We kept a small amount of English to avoid catastrophic forgetting. Training lasted a total of 96 hours on 8 NVIDIA H100 GPUs with 80GB of memory each.
 
 
 ### Training hyperparameters
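For reference, the corrected figure of 50,257 matches the vocabulary size of GPT-2's byte-level BPE tokenizer exactly. A minimal sketch of verifying this, assuming the Hugging Face `transformers` library and using the public `gpt2` checkpoint as a stand-in (the tokenizer of the model updated in this commit is not named in the diff):

```python
# Minimal sketch: inspect the byte-level BPE tokenizer the README
# paragraph refers to. "gpt2" is a stand-in public checkpoint, not
# the model updated in this commit.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

# GPT-2's byte-level BPE vocabulary has exactly 50,257 entries,
# matching the corrected figure in this diff.
print(tokenizer.vocab_size)  # 50257

# Byte-level BPE encodes any UTF-8 string, so Catalan and Spanish
# accented characters never fall out of vocabulary.
print(tokenizer.tokenize("Bon dia, buenos días, good morning"))
```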