
manchuBERT

manchuBERT is a BERT-base model trained from scratch on romanized Manchu text.
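A minimal sketch of querying the model for masked-token prediction with the Transformers `pipeline` API. The repository ID below is a hypothetical placeholder (this card does not state the Hub path), and the helper names `mask_word` and `predict_masked` are made up for illustration. The mask string is read from the loaded tokenizer rather than assumed, with `[MASK]` used only as the default for the standalone helper.

```python
# Hypothetical placeholder: replace with the actual Hub repository ID of manchuBERT.
MODEL_ID = "your-namespace/manchuBERT"


def mask_word(sentence, position, mask_token="[MASK]"):
    """Replace the whitespace-separated word at `position` with the mask token."""
    words = sentence.split()
    words[position] = mask_token
    return " ".join(words)


def predict_masked(sentence, position, model_id=MODEL_ID, top_k=5):
    """Return (token, score) pairs for the masked word. Downloads the model."""
    # Imported lazily so the helper above can be used without transformers installed.
    from transformers import pipeline

    fill = pipeline("fill-mask", model=model_id)
    # Use the tokenizer's own mask token instead of assuming "[MASK]".
    masked = mask_word(sentence, position, fill.tokenizer.mask_token)
    return [(r["token_str"], r["score"]) for r in fill(masked, top_k=top_k)]


# Build a masked input from a romanized Manchu title that appears in the data list.
example = mask_word("ilan gurun i bithe", 1)
print(example)
```

Calling `predict_masked("ilan gurun i bithe", 1)` with a valid repository ID would return the model's top candidates for the second word; the scores depend on the checkpoint and are not shown here.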

Data

manchuBERT uses the data augmentation method described in "Mergen: The First Manchu-Korean Machine Translation Model Trained on Augmented Data".

| Data | Number of Sentences (before augmentation) |
|---|---|
| Manwén Lǎodàng–Taizong | 2,220 |
| Ilan gurun i bithe | 41,904 |
| Gin ping mei bithe | 21,376 |
| Yùzhì Qīngwénjiàn | 11,954 |
| Yùzhì Zengdìng Qīngwénjiàn | 18,420 |
| Manwén Lǎodàng–Taizu | 22,578 |
| Manchu-Korean Dictionary | 40,583 |
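
For a sense of scale, the pre-augmentation counts listed above can be totaled and expressed as shares of the corpus. The short sketch below only does arithmetic on the figures in this card; the names `SENTENCE_COUNTS` and `corpus_shares` are illustrative and not part of any released code.

```python
# Sentence counts before augmentation, copied from the data listing in this card.
SENTENCE_COUNTS = {
    "Manwén Lǎodàng–Taizong": 2_220,
    "Ilan gurun i bithe": 41_904,
    "Gin ping mei bithe": 21_376,
    "Yùzhì Qīngwénjiàn": 11_954,
    "Yùzhì Zengdìng Qīngwénjiàn": 18_420,
    "Manwén Lǎodàng–Taizu": 22_578,
    "Manchu-Korean Dictionary": 40_583,
}


def corpus_shares(counts):
    """Return the total sentence count and each source's fraction of it."""
    total = sum(counts.values())
    return total, {name: n / total for name, n in counts.items()}


total, shares = corpus_shares(SENTENCE_COUNTS)

# Print sources from largest to smallest, with count and percentage share.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<28} {SENTENCE_COUNTS[name]:>7,d}  {share:6.1%}")
print(f"{'Total':<28} {total:>7,d}")
```

The total comes to 159,035 sentences before augmentation, with Ilan gurun i bithe and the Manchu-Korean Dictionary as the two largest sources.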