
TiRoBERTa: RoBERTa Pretrained for the Tigrinya Language

We pretrain a RoBERTa base model for Tigrinya on a dataset of 40 million tokens for 40 epochs.

This repo contains the original pretrained Flax model, which was trained on a TPU v3-8, and its corresponding PyTorch version.
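Since the repo ships both Flax and PyTorch weights, either backend can load them. The following is a minimal sketch, assuming the checkpoint exposes a masked-language-modeling head (the usual RoBERTa pretraining objective) and that the Transformers library is installed. The helper name `load_tiroberta` is hypothetical, and the Hub repo id is not stated in this card, so it is passed in as an argument.

```python
def load_tiroberta(model_id: str, framework: str = "pt"):
    """Load the tokenizer and masked-LM weights for a Hub repo id.

    `load_tiroberta` is an illustrative helper, not part of any library.
    `framework` selects the PyTorch ("pt") or Flax ("flax") weights.
    """
    if framework not in ("pt", "flax"):
        raise ValueError(f"framework must be 'pt' or 'flax', got {framework!r}")

    # Imported lazily so that only the chosen backend needs to be installed.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    if framework == "pt":
        from transformers import AutoModelForMaskedLM

        model = AutoModelForMaskedLM.from_pretrained(model_id)
    else:
        from transformers import FlaxAutoModelForMaskedLM

        model = FlaxAutoModelForMaskedLM.from_pretrained(model_id)
    return tokenizer, model
```

Calling `load_tiroberta(repo_id)` with this repo's identifier returns the tokenizer and the PyTorch model. Passing `framework="flax"` loads the original Flax weights instead.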

Hyperparameters

The hyperparameters of the BASE model are as follows:

Model Size   L    AH   HS    FFN    P      Seq
BASE         12   12   768   3072   125M   512

(L = number of layers; AH = number of attention heads; HS = hidden size; FFN = feedforward network dimension; P = number of parameters; Seq = maximum sequence length.)
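The parameter count in the table can be roughly reproduced from the other columns. The sketch below counts the weights of a standard RoBERTa encoder (embeddings plus transformer layers, excluding any pooler or language-modeling head). The vocabulary size of 50,265 and the 514 position slots are assumptions taken from the original RoBERTa base defaults and are not stated in this card, so TiRoBERTa's actual figures may differ. The number of attention heads does not change the count, because the heads split the same projection matrices.

```python
def roberta_param_count(layers, hidden, ffn, vocab_size, max_positions):
    """Approximate parameter count of a RoBERTa-style encoder."""
    # Embeddings: token + position + one token-type row, then LayerNorm (scale, bias).
    embeddings = (vocab_size + max_positions + 1) * hidden + 2 * hidden
    # Self-attention: Q, K, V and output projections, each with weights and bias.
    attention = 4 * (hidden * hidden + hidden)
    # Feed-forward: hidden -> ffn -> hidden, each projection with a bias.
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)
    # Two LayerNorms per layer, each with scale and bias.
    layer_norms = 2 * 2 * hidden
    per_layer = attention + feed_forward + layer_norms
    return embeddings + layers * per_layer


# L, HS and FFN come from the table; vocab_size and max_positions are assumed.
total = roberta_param_count(
    layers=12, hidden=768, ffn=3072, vocab_size=50265, max_positions=514
)
print(f"{total / 1e6:.1f}M parameters")  # prints "124.1M parameters"
```

Under these assumptions the encoder comes to about 124M parameters, in line with the 125M listed in the table once the task head is added.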

Framework versions

  • Transformers 4.12.0.dev0
  • PyTorch 1.9.0+cu111
  • Datasets 1.13.3
  • Tokenizers 0.10.3

Citation

If you use this model in your product or research, please cite as follows:

@article{Fitsum2021TiPLMs,
  author={Fitsum Gaim and Wonsuk Yang and Jong C. Park},
  title={Monolingual Pre-trained Language Models for Tigrinya},
  year=2021,
  publisher={WiNLP 2021 at EMNLP 2021}
}