## Model description

This model is a BERT-based Myanmar pre-trained language model.

MyanBERTa was pre-trained for 528K steps on a word-segmented Myanmar dataset consisting of 5,992,299 sentences (136M words).
Its tokenizer is a byte-level BPE tokenizer with 30,522 subword units, learned after word segmentation was applied.
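The BPE tokenizer mentioned above learns its subword units by repeatedly merging the most frequent adjacent symbol pair in the (already word-segmented) training corpus. The following is a toy character-level sketch of that merge loop for illustration only — the tiny English corpus is hypothetical, and the actual MyanBERTa tokenizer is a byte-level BPE with 30,522 units trained on the Myanmar corpus with a full tokenizer library, not this code:

```python
# Toy sketch of BPE subword learning (illustration only; not the actual
# MyanBERTa tokenizer). Words are assumed to be pre-segmented, matching
# the "learned after word segmentation" setup described above.
from collections import Counter

def get_pair_counts(words):
    """Count adjacent symbol pairs across the segmented corpus."""
    counts = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            counts[(a, b)] += freq
    return counts

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with its merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = merged.get(tuple(out), 0) + freq
    return merged

def learn_bpe(corpus, num_merges):
    # Start from the individual characters of each segmented word.
    words = Counter(tuple(w) for w in corpus)
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(words)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        words = merge_pair(words, best)
        merges.append(best)
    return merges

# Hypothetical toy corpus, already "word segmented":
merges = learn_bpe(["low", "lower", "lowest", "low"], num_merges=3)
print(merges)  # → [('l', 'o'), ('lo', 'w'), ('low', 'e')]
```

A production byte-level BPE works the same way but operates on UTF-8 bytes rather than characters, so it never produces out-of-vocabulary tokens for Myanmar script.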