---
tags:
- MLM
model-index:
- name: RobertaSin
  results: []
widget:
- text: අපි තමයි [MASK] කරේ.
- text: මට හෙට එන්න වෙන්නේ [MASK].
- text: අපි ගෙදර [MASK].
- text: සිංහල සහ [MASK] අලුත් අවුරුද්ද.
license: apache-2.0
language:
- si
---

# SinhalaRoberta - Pretrained RoBERTa for Sinhala MLM tasks

This model is trained on various Sinhala corpora extracted from news and articles.

## Model description

The model is trained on masked language modeling (MLM) tasks; use the [MASK] token to indicate the position to be predicted. It comprises a total of 68 million parameters.

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3

### Framework versions

- Transformers 4.26.1
- PyTorch 1.13.0
- Datasets 2.1.0
- Tokenizers 0.13.2
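
### How to use

A minimal usage sketch with the Transformers `fill-mask` pipeline. The repository id below is a placeholder, not the model's actual Hub id; substitute the correct id when loading.

```python
from transformers import pipeline

# Placeholder repository id - replace with the actual Hub id of this model.
fill_mask = pipeline("fill-mask", model="your-username/SinhalaRoberta")

# The tokenizer's mask token is [MASK]; place it where a prediction is wanted.
for prediction in fill_mask("අපි ගෙදර [MASK]."):
    print(prediction["token_str"], prediction["score"])
```

The pipeline returns the top candidate tokens for the [MASK] position along with their scores, as illustrated by the widget examples above.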