Update README.md
README.md
CHANGED
````diff
@@ -17,17 +17,11 @@ This model is a BERT based Myanmar pre-trained language model.
 MyanBERTa was pre-trained for 528K steps on a word-segmented Myanmar dataset consisting of 5,992,299 sentences (136M words).
 As the tokenizer, a byte-level BPE tokenizer of 30,522 subword units, learned after word segmentation, is applied.
 
-```
-Contributed by:
-Aye Mya Hlaing
-Win Pa Pa
-```
-
 Cite this work as:
 
 ```
 Aye Mya Hlaing, Win Pa Pa, "MyanBERTa: A Pre-trained Language Model For
-Myanmar", In Proceedings of 2022 International Conference on Communication and Computer Research (ICCR2022), November 2022, Korea
+Myanmar", In Proceedings of 2022 International Conference on Communication and Computer Research (ICCR2022), November 2022, Seoul, Republic of Korea
 ```
 
 [Download Paper](https://journal-home.s3.ap-northeast-2.amazonaws.com/site/iccr2022/abs/QOHFI-0004.pdf)
````
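For context, a minimal usage sketch with the Hugging Face `transformers` library is shown below. The repository ID `UCSYNLP/MyanBERTa` and the masked-LM head are assumptions for illustration, not taken from this commit; since the tokenizer was learned after word segmentation, inputs should be word-segmented Myanmar text.

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# NOTE: the repository ID below is an assumption for illustration only;
# replace it with the actual MyanBERTa model ID on the Hugging Face Hub.
model_id = "UCSYNLP/MyanBERTa"

# Byte-level BPE tokenizer (30,522 subword units), learned on word-segmented text.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Input is expected to be word-segmented Myanmar text (words separated by spaces).
text = "word-segmented Myanmar sentence goes here"
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch_size, sequence_length, vocab_size)
```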