Update README.md
README.md CHANGED
@@ -10,12 +10,15 @@ language:
 license: "apache-2.0"
 ---

+## CINO: Pre-trained Language Models for Chinese Minority Languages

-
+Multilingual pre-trained language models (PLMs), such as mBERT and XLM-R, provide multilingual and cross-lingual capabilities for language understanding.
+Multilingual PLMs have progressed rapidly in recent years.
+However, little work has been done on PLMs for Chinese minority languages, which hinders researchers from building powerful NLP systems.

-
+To address the absence of Chinese minority PLMs, the Joint Laboratory of HIT and iFLYTEK Research (HFL) proposes CINO (Chinese-miNOrity pre-trained language model), which is built on XLM-R with additional pre-training on Chinese minority-language corpora, including Tibetan, Mongolian (Uighur form), Uyghur, Kazakh (Arabic form), Korean, Zhuang, and Cantonese.

-
+Please read our GitHub repository for more details (Chinese): https://github.com/ymcui/Chinese-Minority-PLM

 You may also be interested in,
