language:
- as
- bn
- gu
- hi
- mr
- ne
- or
- pa
- si
license: apache-2.0
datasets:
- oscar
tags:
- multilingual
- albert
- masked-language-modeling
- sentence-order-prediction
- fill-mask
- nlp
# XLMIndic Base Uniscript
An ALBERT model pretrained on the OSCAR corpus for Assamese, Bengali, Gujarati, Hindi, Marathi,
Nepali, Oriya, Panjabi and Sinhala. Like ALBERT, it was pretrained using a masked language modeling (MLM)
objective and a sentence order prediction (SOP) objective. The text was transliterated
to ISO 15919 with the Aksharamukha library before pretraining. A demo of the Aksharamukha library is hosted [here](https://aksharamukha.appspot.com/converter),
where you can transliterate your text and then try it with our model in the inference widget.
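
The same transliterate-then-predict flow can also be sketched in Python instead of going through the web demo. This is a minimal sketch under stated assumptions, not a definitive recipe: the Hub id in `MODEL_ID` is an assumption (it is not stated above), the helper names `to_iso15919` and `fill_mask` are hypothetical, and the `aksharamukha` and `transformers` packages are assumed to be installed. The check on the mask token is a precaution added here in case transliteration alters it.

```python
from typing import Callable, Optional

# Assumption: the Hub id of this model; replace it with the actual repository id.
MODEL_ID = "ibraheemmoosa/xlmindic-base-uniscript"
MASK_TOKEN = "[MASK]"


def to_iso15919(
    text: str,
    source_script: str,
    transliterate_fn: Optional[Callable[[str, str, str], str]] = None,
) -> str:
    """Transliterate `text` from `source_script` (e.g. "Bengali") to ISO 15919.

    `transliterate_fn(source, target, text)` defaults to Aksharamukha's
    `transliterate.process`; it is injectable so the helper can be tried
    without the library installed.
    """
    if transliterate_fn is None:
        # Imported lazily so defining the helper does not require the package.
        from aksharamukha import transliterate

        transliterate_fn = transliterate.process
    # "ISO" is Aksharamukha's name for the ISO 15919 romanization.
    iso_text = transliterate_fn(source_script, "ISO", text)
    # Precaution: the model can only fill a mask token that is still intact.
    if MASK_TOKEN in text and MASK_TOKEN not in iso_text:
        raise ValueError("mask token was altered during transliteration")
    return iso_text


def fill_mask(text: str, source_script: str, top_k: int = 5):
    """Transliterate `text`, then return the model's top_k mask predictions."""
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model=MODEL_ID)
    return unmasker(to_iso15919(text, source_script), top_k=top_k)
```

For example, `fill_mask("আমি বাংলায় [MASK] গাই।", "Bengali")` would download the model on first use and return a list of candidate fills, each a dict that includes `token_str` and `score`. The predicted tokens are in ISO 15919, not in the original script.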