mapama247 committed on
Commit d917d6b
1 Parent(s): 0f53b7c

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -179,9 +179,9 @@ Note: A small amount of English data was kept to avoid catastrophic forgetting.
 
 ## Training procedure
 
-The training corpus has been tokenized using a byte version of [Byte-Pair Encoding (BPE)](https://github.com/openai/gpt-2) used
-in the original [RoBERTA](https://github.com/pytorch/fairseq/tree/master/examples/roberta) model with a vocabulary size of 50,257 tokens.
-After training a new tokenizer and adapting [falcon-7b](https://huggingface.co/tiiuae/falcon-7b)'s embedding layer, we continued its pre-training in three target languages: Catalan, Spanish, and English.
+The training corpus has been tokenized using a byte version of [Byte-Pair Encoding (BPE)](https://github.com/openai/gpt-2) with a vocabulary size of 50,257 tokens.
+After training a new tokenizer and adapting [falcon-7b](https://huggingface.co/tiiuae/falcon-7b)'s embedding layer, the model was
+further pre-trained in three target languages: Catalan, Spanish and English.
 The training lasted a total of 320 hours on 8 NVIDIA H100 GPUs with 80GB RAM.
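
The reworded passage describes the same procedure as before: train a new byte-level BPE tokenizer (vocabulary size 50,257), adapt falcon-7b's embedding layer to it, and continue pre-training on Catalan, Spanish and English data. The sketch below illustrates that setup with the standard Hugging Face `transformers` API; the corpus iterator is a placeholder, and the `resize_token_embeddings` call is an assumed simplification of the embedding-layer adaptation, not necessarily the exact method used for this model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

def corpus_iterator():
    # Placeholder for the Catalan/Spanish/English training corpus described in the
    # README; the real data pipeline is not reproduced here.
    for text in [
        "Frase d'exemple en català.",
        "Frase de ejemplo en español.",
        "Example sentence in English.",
    ]:
        yield text

# Train a new byte-level BPE tokenizer with the same vocabulary size (50,257),
# reusing falcon-7b's tokenizer configuration as the starting point.
base_tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-7b")
new_tokenizer = base_tokenizer.train_new_from_iterator(corpus_iterator(), vocab_size=50257)

# Load falcon-7b and resize its embedding layer to match the new vocabulary
# before continuing pre-training on the multilingual corpus.
model = AutoModelForCausalLM.from_pretrained("tiiuae/falcon-7b")
model.resize_token_embeddings(len(new_tokenizer))
```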