metadata
language: ti
widget:
  - text: ድምጻዊ ኣብርሃም ኣፈወርቂ ንዘልኣለም ህያው ኮይኑ ኣብ ልብና ይነብር
datasets:
  - TLMD
  - NTC
metrics:
  - f1
  - precision
  - recall
  - accuracy
model-index:
  - name: tiroberta-base-pos
    results:
      - task:
          name: Token Classification
          type: token-classification
        metrics:
          - name: F1
            type: f1
            value: 0.9562
          - name: Precision
            type: precision
            value: 0.9562
          - name: Recall
            type: recall
            value: 0.9562
          - name: Accuracy
            type: accuracy
            value: 0.9562

Tigrinya POS tagging with TiRoBERTa

This model is TiRoBERTa fine-tuned for Tigrinya part-of-speech (POS) tagging on the Nagaoka Tigrinya Corpus, NTC-v1 (Tedla et al. 2016).
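
For quick experimentation, a fine-tuned tagger like this one can be loaded through the `transformers` token-classification pipeline. The following is a minimal sketch rather than the author's reference code: the Hub ID `fgaim/tiroberta-pos` is an assumption, and `load_tagger` and `tag_sentence` are hypothetical helper names introduced here. No aggregation strategy is set, so predictions come back per subword token rather than per word.

```python
# Assumed Hub ID for this model; adjust it if the repository lives elsewhere.
MODEL_ID = "fgaim/tiroberta-pos"


def load_tagger(model_id=MODEL_ID):
    """Build a token-classification pipeline (downloads the model on first use)."""
    # Imported lazily so tag_sentence can be exercised without transformers installed.
    from transformers import pipeline

    return pipeline("token-classification", model=model_id)


def tag_sentence(tagger, text):
    """Return (token, tag, score) triples for one sentence.

    Without an aggregation strategy each item is a subword token, so a single
    word may show up as several consecutive pieces.
    """
    return [(p["word"], p["entity"], float(p["score"])) for p in tagger(text)]
```

Calling `tag_sentence(load_tagger(), text)` on a Tigrinya sentence then yields one triple per subword token.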

Training

Hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10.0
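
To make the `linear` scheduler setting concrete, here is a small sketch of how the learning rate would evolve under linear decay from the base rate to zero. It rests on two assumptions not stated in this card: zero warmup steps, and a hypothetical total of 1,000 optimizer steps (the real step count depends on the training-set size, which is not given here). `linear_lr` is an illustrative helper, not a library function.

```python
BASE_LR = 5e-05  # learning_rate from the hyperparameter list


def linear_lr(step, total_steps, base_lr=BASE_LR, warmup_steps=0):
    """Learning rate at `step`: optional linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical run length (assumption): 1,000 optimizer steps in total.
total = 1000
schedule = [linear_lr(s, total) for s in (0, 250, 500, 1000)]
```

Under these assumptions the rate starts at the base value, is halved at the midpoint of training, and reaches zero at the final step.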

Results

The model achieves the following results on the test set. For each tag, Number is the count of test-set instances of that tag:

  • Loss: 0.3194
  • Adj Precision: 0.9219
  • Adj Recall: 0.9335
  • Adj F1: 0.9277
  • Adj Number: 1670
  • Adv Precision: 0.8297
  • Adv Recall: 0.8554
  • Adv F1: 0.8423
  • Adv Number: 484
  • Con Precision: 0.9844
  • Con Recall: 0.9763
  • Con F1: 0.9804
  • Con Number: 972
  • Fw Precision: 0.7895
  • Fw Recall: 0.5357
  • Fw F1: 0.6383
  • Fw Number: 28
  • Int Precision: 0.6552
  • Int Recall: 0.7308
  • Int F1: 0.6909
  • Int Number: 26
  • N Precision: 0.9650
  • N Recall: 0.9662
  • N F1: 0.9656
  • N Number: 3992
  • Num Precision: 0.9747
  • Num Recall: 0.9665
  • Num F1: 0.9706
  • Num Number: 239
  • N Prp Precision: 0.9308
  • N Prp Recall: 0.9447
  • N Prp F1: 0.9377
  • N Prp Number: 470
  • N V Precision: 0.9854
  • N V Recall: 0.9736
  • N V F1: 0.9794
  • N V Number: 416
  • Pre Precision: 0.9722
  • Pre Recall: 0.9625
  • Pre F1: 0.9673
  • Pre Number: 907
  • Pro Precision: 0.9448
  • Pro Recall: 0.9236
  • Pro F1: 0.9341
  • Pro Number: 445
  • Pun Precision: 1.0
  • Pun Recall: 0.9994
  • Pun F1: 0.9997
  • Pun Number: 1607
  • Unc Precision: 1.0
  • Unc Recall: 0.875
  • Unc F1: 0.9333
  • Unc Number: 16
  • V Precision: 0.8780
  • V Recall: 0.9231
  • V F1: 0.9
  • V Number: 78
  • V Aux Precision: 0.9685
  • V Aux Recall: 0.9878
  • V Aux F1: 0.9780
  • V Aux Number: 654
  • V Ger Precision: 0.9388
  • V Ger Recall: 0.9571
  • V Ger F1: 0.9479
  • V Ger Number: 513
  • V Imf Precision: 0.9634
  • V Imf Recall: 0.9497
  • V Imf F1: 0.9565
  • V Imf Number: 914
  • V Imv Precision: 0.8793
  • V Imv Recall: 0.7286
  • V Imv F1: 0.7969
  • V Imv Number: 70
  • V Prf Precision: 0.8960
  • V Prf Recall: 0.9082
  • V Prf F1: 0.9020
  • V Prf Number: 294
  • V Rel Precision: 0.9678
  • V Rel Recall: 0.9538
  • V Rel F1: 0.9607
  • V Rel Number: 757
  • Overall Precision: 0.9562
  • Overall Recall: 0.9562
  • Overall F1: 0.9562
  • Overall Accuracy: 0.9562
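
The per-tag figures can be sanity-checked against each other. The sketch below is a consistency check on the reported numbers, not the evaluation script that produced them. It recomputes each F1 as the harmonic mean of precision and recall, and recomputes overall recall as the mean of per-tag recall weighted by each tag's Number (treated here as support, which is an assumption). `PER_TAG`, `f1_score`, and `weighted_recall` are names introduced for illustration.

```python
# (precision, recall, f1, number) copied from the results list above.
PER_TAG = {
    "Adj": (0.9219, 0.9335, 0.9277, 1670),
    "Adv": (0.8297, 0.8554, 0.8423, 484),
    "Con": (0.9844, 0.9763, 0.9804, 972),
    "Fw": (0.7895, 0.5357, 0.6383, 28),
    "Int": (0.6552, 0.7308, 0.6909, 26),
    "N": (0.9650, 0.9662, 0.9656, 3992),
    "Num": (0.9747, 0.9665, 0.9706, 239),
    "N Prp": (0.9308, 0.9447, 0.9377, 470),
    "N V": (0.9854, 0.9736, 0.9794, 416),
    "Pre": (0.9722, 0.9625, 0.9673, 907),
    "Pro": (0.9448, 0.9236, 0.9341, 445),
    "Pun": (1.0, 0.9994, 0.9997, 1607),
    "Unc": (1.0, 0.875, 0.9333, 16),
    "V": (0.8780, 0.9231, 0.9, 78),
    "V Aux": (0.9685, 0.9878, 0.9780, 654),
    "V Ger": (0.9388, 0.9571, 0.9479, 513),
    "V Imf": (0.9634, 0.9497, 0.9565, 914),
    "V Imv": (0.8793, 0.7286, 0.7969, 70),
    "V Prf": (0.8960, 0.9082, 0.9020, 294),
    "V Rel": (0.9678, 0.9538, 0.9607, 757),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_recall(rows):
    """Mean of per-tag recall, weighted by each tag's instance count."""
    total = sum(n for _p, _r, _f, n in rows.values())
    return sum(r * n for _p, r, _f, n in rows.values()) / total


# Largest gap between a recomputed F1 and the reported one.
max_f1_gap = max(abs(f1_score(p, r) - f1) for p, r, f1, _n in PER_TAG.values())
overall_recall = weighted_recall(PER_TAG)
```

With the values above, the largest F1 discrepancy stays below 0.0005 (rounding noise), and the weighted recall rounds to the reported overall recall of 0.9562.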

Framework versions

  • Transformers 4.12.0.dev0
  • PyTorch 1.9.0+cu111
  • Datasets 1.13.3
  • Tokenizers 0.10.3

Citation

If you use this model in your product or research, please cite as follows:

@article{Fitsum2021TiPLMs,
  author={Fitsum Gaim and Wonsuk Yang and Jong C. Park},
  title={Monolingual Pre-trained Language Models for Tigrinya},
  year=2021,
  publisher={WiNLP 2021/EMNLP 2021}
}

References

Tedla, Y., Yamamoto, K. & Marasinghe, A. (2016). Tigrinya Part-of-Speech Tagging with Morphological Patterns and the New Nagaoka Tigrinya Corpus. International Journal of Computer Applications, 146, pp. 33-41.