Here we provide a pretrained ALBERT model and a trained SentencePiece model for Mongolian text. The training data consists of the Mongolian Wikipedia corpus from Wikipedia Downloads and a Mongolian news corpus.
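A minimal sketch of how the pretrained model and its tokenizer might be loaded with the Hugging Face `transformers` library. The Hub identifier `bayartsogt/albert-mongolian` and the helper name `load_albert_mongolian` are assumptions for illustration and are not stated in this text; adjust the identifier to wherever the checkpoint is actually hosted.

```python
# Assumption: the checkpoint is published on the Hugging Face Hub under this ID.
MODEL_ID = "bayartsogt/albert-mongolian"


def load_albert_mongolian(model_id: str = MODEL_ID):
    """Return (tokenizer, model) for masked-language-model inference.

    Hypothetical helper: requires the `transformers`, `torch`, and
    `sentencepiece` packages and network access on first use.
    """
    # Imported lazily so that defining this helper does not require the library.
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForMaskedLM.from_pretrained(model_id)
    model.eval()  # disable dropout for inference
    return tokenizer, model
```

Calling `load_albert_mongolian()` would download the weights and the SentencePiece vocabulary on first use, after which the tokenizer can encode Mongolian text for the model.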
Pretraining evaluation metrics (masked language modeling and sentence order prediction):
loss = 1.7478163
masked_lm_accuracy = 0.6838185
masked_lm_loss = 1.6687671
sentence_order_accuracy = 0.998125
sentence_order_loss = 0.007942731
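To make the metrics above easier to interpret, the following sketch converts the masked-LM loss into a perplexity and the sentence-order accuracy into an error rate. It assumes the reported loss is a mean cross-entropy measured in nats (natural logarithm), which is not stated here; if the loss uses another base, the perplexity would differ.

```python
import math

# Values copied from the metrics above.
masked_lm_loss = 1.6687671
sentence_order_accuracy = 0.998125

# Assumption: the loss is a mean cross-entropy in nats, so exp(loss) is the
# perplexity over the masked tokens.
masked_lm_perplexity = math.exp(masked_lm_loss)

# Fraction of sentence-order predictions that were wrong.
sentence_order_error = 1.0 - sentence_order_accuracy

print(f"masked-LM perplexity: {masked_lm_perplexity:.2f}")
print(f"sentence-order error rate: {sentence_order_error:.6f}")
```

Under that assumption, the perplexity can be read roughly as the effective number of equally likely candidate tokens the model hesitates between at each masked position.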
class          precision    recall  f1-score   support
байгал орчин        0.85      0.83      0.84       999
боловсрол           0.80      0.80      0.80       873
спорт               0.98      0.98      0.98      2736
технологи           0.88      0.93      0.91      1102
улс төр             0.92      0.85      0.89      2647
урлаг соёл          0.93      0.94      0.94      1457
хууль               0.89      0.87      0.88      1651
эдийн засаг         0.83      0.88      0.86      2509
эрүүл мэнд          0.89      0.92      0.90      1159
accuracy                                0.90     15133
macro avg           0.89      0.89      0.89     15133
weighted avg        0.90      0.90      0.90     15133
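The summary rows of the report can be checked against the per-class rows. The sketch below recomputes the macro and support-weighted averages from the values listed above. Because the per-class inputs are already rounded to two decimals, the recomputed figures are only expected to agree after rounding. The helper names `macro_avg` and `weighted_avg` are illustrative.

```python
# Per-class (precision, recall, f1-score, support), copied from the report above.
report = {
    "байгал орчин": (0.85, 0.83, 0.84, 999),
    "боловсрол": (0.80, 0.80, 0.80, 873),
    "спорт": (0.98, 0.98, 0.98, 2736),
    "технологи": (0.88, 0.93, 0.91, 1102),
    "улс төр": (0.92, 0.85, 0.89, 2647),
    "урлаг соёл": (0.93, 0.94, 0.94, 1457),
    "хууль": (0.89, 0.87, 0.88, 1651),
    "эдийн засаг": (0.83, 0.88, 0.86, 2509),
    "эрүүл мэнд": (0.89, 0.92, 0.90, 1159),
}

PRECISION, RECALL, F1, SUPPORT = range(4)
total_support = sum(row[SUPPORT] for row in report.values())


def macro_avg(col: int) -> float:
    """Unweighted mean of one metric column over all classes."""
    return sum(row[col] for row in report.values()) / len(report)


def weighted_avg(col: int) -> float:
    """Mean of one metric column, weighted by each class's support."""
    return sum(row[col] * row[SUPPORT] for row in report.values()) / total_support


print("total support:", total_support)
for name, col in (("precision", PRECISION), ("recall", RECALL), ("f1-score", F1)):
    print(f"{name}: macro={macro_avg(col):.2f} weighted={weighted_avg(col):.2f}")
```

The support-weighted recall is, by definition, the overall accuracy, so it should match the `accuracy` row up to rounding.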
@misc{albert-mongolian,
author = {Bayartsogt Yadamsuren},
title = {ALBERT Pretrained Model on Mongolian Datasets},
year = {2020},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/bayartsogt-ya/albert-mongolian/}}
}
For questions, please contact bayartsogtyadamsuren@icloud.com.