javier-ab-bsc committed on
Commit
ed68404
1 Parent(s): 1117366

Update README.md

Files changed (1): README.md +1 -1
README.md CHANGED
@@ -149,7 +149,7 @@ to be adapted before continuing its pre-training with data in the target language
 1) We trained our own BPE tokenizer for Catalan, Spanish, and English, and replaced the original BLOOM tokenizer and vocabulary with it. This procedure implied a downsizing of the original BLOOM's embedding layer and, therefore, a model compression from 7.1B parameters to 6.3B.
 2) The embeddings corresponding to tokens that are present in both the original and the target vocabulary (matching tokens) were used for initialization.
 3) The embeddings from tokens not present in BLOOM's original vocabulary were initialized as the average of all embeddings.
-4) The model was initialized with the weights from BOOM-7.1B, and with our adapted tokenizer (step 1) and embeddings (steps 2-3).
+4) The model was initialized with the weights from BLOOM-7.1B, and with our adapted tokenizer (step 1) and embeddings (steps 2-3).
 5) The model was then trained on a corpus that contains a mixture of Catalan, Spanish, and English data.

 ### Training data
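The embedding re-initialization described in steps 2–3 can be sketched as follows. This is an illustrative NumPy mock-up, not the authors' actual code: the toy vocabularies, the embedding dimension, and the random original matrix are all stand-ins for the real BLOOM tokenizer and weights.

```python
# Hypothetical sketch of steps 2-3: tokens shared by both vocabularies keep
# their original embeddings; tokens new to the target vocabulary are
# initialized to the average of all original embeddings.
# Vocabularies, dimensions, and weights here are illustrative only.
import numpy as np

rng = np.random.default_rng(0)

old_vocab = {"hello": 0, "world": 1, "the": 2, "cat": 3}  # stand-in for BLOOM's vocab
new_vocab = {"hello": 0, "the": 1, "gat": 2}              # stand-in for the adapted vocab

dim = 4
old_emb = rng.normal(size=(len(old_vocab), dim))  # original embedding matrix
mean_emb = old_emb.mean(axis=0)                   # step 3: average of all embeddings

new_emb = np.empty((len(new_vocab), dim))
for token, new_id in new_vocab.items():
    if token in old_vocab:
        # step 2: matching token -> copy its original embedding
        new_emb[new_id] = old_emb[old_vocab[token]]
    else:
        # step 3: token unseen by the original vocab -> average embedding
        new_emb[new_id] = mean_emb
```

Because the adapted vocabulary is smaller than the original one, the resulting embedding matrix has fewer rows, which is the source of the 7.1B-to-6.3B parameter reduction mentioned in step 1.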