nreimers committed on
Commit
5acf075
1 Parent(s): 2c75733

Update README.md

  1. README.md +3 -1
README.md CHANGED
@@ -2,5 +2,7 @@
 
 This model was initialized with a word2vec token embedding matrix with 256k entries, but these token embeddings were updated during MLM. The word2vec was trained on 100GB data from C4, MSMARCO, News, Wikipedia, S2ORC, for 3 epochs.
 
-Then the model was trained on this dataset with MLM for 250k steps (batch size 64). The token embeddings were updated during MLM.
+Then the model was trained on this dataset with MLM for 750k steps (batch size 64). The token embeddings were updated during MLM.
+
+For the same model but with frozen token embeddings while MLM training see: https://huggingface.co/vocab-transformers/distilbert-word2vec_256k-MLM_750k
 