metadata
language:
  - as
  - bn
  - gu
  - hi
  - mr
  - ne
  - or
  - pa
  - si
  - sa
  - bpy
  - bh
  - gom
  - mai
license: apache-2.0
datasets:
  - oscar
tags:
  - multilingual
  - albert
  - masked-language-modeling
  - sentence-order-prediction
  - fill-mask
  - nlp

XLMIndic Base Multiscript

An ALBERT model pretrained on the OSCAR corpus for the following languages: Assamese, Bengali, Bihari, Bishnupriya Manipuri, Goan Konkani, Gujarati, Hindi, Maithili, Marathi, Nepali, Oriya, Panjabi, Sanskrit and Sinhala. Like ALBERT, it was pretrained with masked language modeling (MLM) and sentence order prediction (SOP) objectives.
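
To make the two pretraining objectives concrete, here is a minimal pure-Python sketch of how one training example might be built. It is an illustration only, not the pipeline used to train this model: the function names, the `[MASK]` placeholder, the 15% masking rate and the 50% swap probability are all assumptions, and the masking is simplified to always substitute the mask token.

```python
import random

MASK = "[MASK]"  # assumed placeholder; the real tokenizer's mask token may differ


def make_sop_example(first, second, rng):
    """Return (segment_a, segment_b, label): 0 = original order, 1 = swapped."""
    if rng.random() < 0.5:  # assumed 50% swap probability
        return second, first, 1
    return first, second, 0


def mask_tokens(tokens, rng, rate=0.15):
    """Replace roughly `rate` of the tokens with MASK (at least one).

    `labels` holds the original token at masked positions and None elsewhere,
    so a loss would only be computed on the masked positions.
    Expects a non-empty token list.
    """
    n_mask = max(1, round(len(tokens) * rate))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    masked = [MASK if i in positions else t for i, t in enumerate(tokens)]
    labels = [t if i in positions else None for i, t in enumerate(tokens)]
    return masked, labels


rng = random.Random(0)  # seeded so the sketch is repeatable
first, second = ["the", "cat", "sat"], ["on", "the", "mat"]

# SOP: the model must predict whether the two segments are in their original order.
seg_a, seg_b, sop_label = make_sop_example(first, second, rng)

# MLM: the model must recover the tokens hidden behind MASK.
original = seg_a + seg_b
masked, labels = mask_tokens(original, rng)
```

In this sketch `sop_label` is the target for the SOP objective and `labels` is the target for the MLM objective; in actual pretraining both losses are computed on the same input.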