
Pre-trained ELECTRA Small for the Tigrinya Language

We pre-train an ELECTRA small model on the TLMD dataset, which contains over 40 million Tigrinya tokens.

Trained model weights are provided for both Flax and PyTorch.
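The model can be loaded for masked-token prediction with the `transformers` library. This is a minimal sketch: the repository ID below is a placeholder, not the actual Hub name of this model, and the PyTorch weights are loaded by default.

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Placeholder Hub ID -- replace with this model's actual repository name.
MODEL_ID = "your-username/electra-small-tigrinya"

def load_model(model_id=MODEL_ID):
    """Load the tokenizer and the masked-LM head for fill-mask inference."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForMaskedLM.from_pretrained(model_id)
    return tokenizer, model
```

The same checkpoint can also be passed to `pipeline("fill-mask", model=MODEL_ID)` for one-line inference.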

Hyperparameters

The hyperparameters of the pre-trained model are as follows:

| Model Size | L  | AH | HS  | FFN  | P   | Seq |
|------------|----|----|-----|------|-----|-----|
| SMALL      | 12 | 4  | 256 | 1024 | 14M | 512 |

(L = number of layers; AH = number of attention heads; HS = hidden size; FFN = feedforward network dimension; P = number of parameters; Seq = maximum sequence length.)
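The ~14M parameter count in the table can be sanity-checked from the other columns. This is a rough back-of-the-envelope estimate under two assumptions not stated in the card: a vocabulary of 30,522 tokens and an embedding size of 128 (ELECTRA small factorizes its embeddings and projects them up to the hidden size).

```python
# Assumed, not from the card: vocab size and factorized embedding size.
VOCAB, EMB = 30522, 128
# From the table above: layers, hidden size, FFN dimension, sequence length.
L, HS, FFN, SEQ = 12, 256, 1024, 512

embeddings = VOCAB * EMB + SEQ * EMB + 2 * EMB  # token + position + type
projection = EMB * HS + HS                      # embedding -> hidden size
per_layer = (
    4 * (HS * HS + HS)    # Q, K, V, and attention output projections
    + HS * FFN + FFN      # FFN up-projection
    + FFN * HS + HS       # FFN down-projection
    + 4 * HS              # two layer norms (scale + bias each)
)
total = embeddings + projection + L * per_layer
print(f"~{total / 1e6:.1f}M parameters")  # close to the 14M in the table
```

The estimate lands near 13.5M; the remaining gap comes from the assumed vocabulary size and small omitted terms.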
