Updated README
README.md
CHANGED
@@ -24,7 +24,7 @@ It was then converted to the WordPiece format used by BERT.
 
 ## Pretraining
 
-We used the BERT-base configuration with 12 layers, 768 hidden units, 12 heads,
+We used the BERT-base configuration with 12 layers, 768 hidden units, 12 heads, 512 sequence length, 128 mini-batch size and 32k token vocabulary.
 
 ## Citation
 
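For reference, the model configuration described in the added line corresponds roughly to the following sketch using the Hugging Face `transformers.BertConfig` API. This is an illustration only, not a script from this repo: the exact vocabulary size (32,000 for "32k") is an assumption, and the 128 mini-batch size belongs to the training setup rather than the model config.

```python
# Sketch of the BERT-base configuration described in the README sentence,
# using the Hugging Face transformers API (assumed dependency).
from transformers import BertConfig, BertForMaskedLM

config = BertConfig(
    num_hidden_layers=12,         # 12 transformer layers
    hidden_size=768,              # 768 hidden units
    num_attention_heads=12,       # 12 attention heads
    max_position_embeddings=512,  # 512-token maximum sequence length
    vocab_size=32_000,            # 32k WordPiece vocabulary (assumed exact value)
)

# Randomly initialized BERT-base model for masked-language-model pretraining.
model = BertForMaskedLM(config)
print(model.num_parameters())
```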