---
language: ti
widget:
- text: "ድምጻዊ ኣብርሃም ኣፈወርቂ ንዘልኣለም ህያው ኮይኑ ኣብ ልብና ይነብር"
datasets:
- TLMD
- NTC
metrics:
- f1
- precision
- recall
- accuracy
model-index:
- name: tiroberta-base-pos
  results:
  - task:
      name: Token Classification
      type: token-classification
    metrics:
    - name: F1
      type: f1
      value: 0.9562
    - name: Precision
      type: precision
      value: 0.9562
    - name: Recall
      type: recall
      value: 0.9562
    - name: Accuracy
      type: accuracy
      value: 0.9562
---


# Tigrinya POS tagging with TiRoBERTa

This model is a fine-tuned version of [TiRoBERTa](https://huggingface.co/fgaim/tiroberta) for Tigrinya part-of-speech (POS) tagging, trained on the NTC-v1 dataset (Tedla et al., 2016).
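
## Usage

The model can be queried through the `transformers` token-classification pipeline. The repository id below is assumed from the model name in the metadata; adjust it if the model is hosted under a different id.

```python
from transformers import pipeline

# Repository id assumed from the model-index name in the metadata above.
tagger = pipeline("token-classification", model="fgaim/tiroberta-base-pos")

# Example sentence taken from the widget metadata above.
for token in tagger("ድምጻዊ ኣብርሃም ኣፈወርቂ ንዘልኣለም ህያው ኮይኑ ኣብ ልብና ይነብር"):
    print(token["word"], token["entity"], round(token["score"], 4))
```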

## Training

### Hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10.0
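
Expressed as Hugging Face `TrainingArguments`, the configuration above looks roughly like the sketch below. This is illustrative, not the published training script; data loading, label alignment, and the `Trainer` call are omitted.

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above. The output_dir is a
# placeholder, not taken from the original run.
training_args = TrainingArguments(
    output_dir="tiroberta-base-pos",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=32,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=10.0,
)
```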

### Results

The model achieves the following results on the test set (evaluation loss: 0.3194):

| Tag   | Precision | Recall | F1     | Support |
|:------|----------:|-------:|-------:|--------:|
| Adj   |    0.9219 | 0.9335 | 0.9277 |    1670 |
| Adv   |    0.8297 | 0.8554 | 0.8423 |     484 |
| Con   |    0.9844 | 0.9763 | 0.9804 |     972 |
| Fw    |    0.7895 | 0.5357 | 0.6383 |      28 |
| Int   |    0.6552 | 0.7308 | 0.6909 |      26 |
| N     |    0.9650 | 0.9662 | 0.9656 |    3992 |
| Num   |    0.9747 | 0.9665 | 0.9706 |     239 |
| N Prp |    0.9308 | 0.9447 | 0.9377 |     470 |
| N V   |    0.9854 | 0.9736 | 0.9794 |     416 |
| Pre   |    0.9722 | 0.9625 | 0.9673 |     907 |
| Pro   |    0.9448 | 0.9236 | 0.9341 |     445 |
| Pun   |    1.0000 | 0.9994 | 0.9997 |    1607 |
| Unc   |    1.0000 | 0.8750 | 0.9333 |      16 |
| V     |    0.8780 | 0.9231 | 0.9000 |      78 |
| V Aux |    0.9685 | 0.9878 | 0.9780 |     654 |
| V Ger |    0.9388 | 0.9571 | 0.9479 |     513 |
| V Imf |    0.9634 | 0.9497 | 0.9565 |     914 |
| V Imv |    0.8793 | 0.7286 | 0.7969 |      70 |
| V Prf |    0.8960 | 0.9082 | 0.9020 |     294 |
| V Rel |    0.9678 | 0.9538 | 0.9607 |     757 |

Overall scores:

- Precision: 0.9562
- Recall: 0.9562
- F1: 0.9562
- Accuracy: 0.9562
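
This per-tag layout (precision, recall, F1, and a support count per tag, plus matching overall scores) matches what the `seqeval` metric produces in the standard Transformers token-classification examples; whether the scores were computed exactly this way is an assumption. A minimal sketch:

```python
from datasets import load_metric  # Datasets 1.x API, matching the versions below

# seqeval returns one dict per tag ({"precision", "recall", "f1", "number"})
# plus overall_precision, overall_recall, overall_f1, overall_accuracy.
metric = load_metric("seqeval")
results = metric.compute(
    predictions=[["N", "V", "Pun"]],   # model output for one sentence (illustrative tags)
    references=[["N", "Adj", "Pun"]],  # gold labels
)
print(results["overall_f1"], results["overall_accuracy"])
```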

### Framework versions

- Transformers 4.12.0.dev0
- PyTorch 1.9.0+cu111
- Datasets 1.13.3
- Tokenizers 0.10.3


## Citation

If you use this model in your product or research, please cite as follows:

```
@inproceedings{Fitsum2021TiPLMs,
  author    = {Fitsum Gaim and Wonsuk Yang and Jong C. Park},
  title     = {Monolingual Pre-trained Language Models for Tigrinya},
  booktitle = {Widening NLP Workshop (WiNLP), co-located with EMNLP 2021},
  year      = {2021}
}
```


## References

```
Tedla, Y., Yamamoto, K., & Marasinghe, A. (2016).
Tigrinya Part-of-Speech Tagging with Morphological Patterns and the New Nagaoka Tigrinya Corpus.
International Journal of Computer Applications, 146, 33-41.
```