Update README.md
README.md CHANGED
@@ -1,5 +1,7 @@
 ---
 license: cc
+language:
+- en
 ---
 
 # Bio-ELECTRA Base 1m (cased)
@@ -34,7 +36,7 @@ to the properly tokenized and segmented sentences.
 ## Pretraining
 
 The model is pretrained on a single 8 core version 3 tensor processing unit (TPU) with 128 GB of RAM for 1,000,000 steps
-with a batch size of 256. The training
+with a batch size of 256. The training parameters were the same as the original ELECTRA base model. The model has 110M parameters,
 12 transformers layers with hidden layer size of 768 and 12 attention heads.
 
 
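For reference, a minimal sketch of the architecture described in the updated pretraining paragraph, expressed as a Hugging Face `transformers` config. The 12 layers, hidden size 768, and 12 attention heads are stated above; `vocab_size`, `embedding_size`, and `intermediate_size` are assumptions carried over from the original ELECTRA base configuration, which the README says was reused, and are not confirmed by this commit:

```python
from transformers import ElectraConfig, ElectraModel

# Architecture from the updated paragraph: 12 transformer layers,
# hidden size 768, 12 attention heads.
config = ElectraConfig(
    vocab_size=30522,        # assumption: ELECTRA base default
    embedding_size=768,      # assumption: ELECTRA base default
    hidden_size=768,
    num_hidden_layers=12,
    num_attention_heads=12,
    intermediate_size=3072,  # assumption: ELECTRA base default
)

# Instantiating the model lets us sanity-check the stated size:
# the total comes out at roughly 110M parameters, as the README claims.
model = ElectraModel(config)
print(sum(p.numel() for p in model.parameters()))
```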