---
datasets:
- universal_dependencies
language:
- tl
metrics:
- f1
pipeline_tag: token-classification
---

## Model Specification

- Model: RoBERTa Tagalog Base (Jan Christian Blaise Cruz)
- Randomized training order of languages
- Training Data:
  - Combined English, Serbian, Slovenian, & Naija corpora (Top 4 Languages)
- Training Details:
  - Base configurations with learning rate 5e-5

## Evaluation

- Evaluation Dataset: Universal Dependencies Tagalog Ugnayan (test set)
- Zero-shot cross-lingual evaluation on the Universal Dependencies Tagalog Ugnayan test set: 72.97% accuracy
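
For readers who want to reproduce a score of this kind, the following is a minimal sketch of token-level tagging accuracy. It assumes the reported figure is the fraction of test tokens whose predicted tag matches the gold tag. The card does not state the exact scoring script, and the function name `token_accuracy` and the toy data are illustrative only.

```python
from typing import List


def token_accuracy(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Return the fraction of tokens whose predicted tag equals the gold tag."""
    correct = 0
    total = 0
    for gold_sent, pred_sent in zip(gold, pred):
        # Each predicted sentence must align token-for-token with the gold one.
        if len(gold_sent) != len(pred_sent):
            raise ValueError("gold and predicted sentences must have equal length")
        for g, p in zip(gold_sent, pred_sent):
            correct += int(g == p)
            total += 1
    if total == 0:
        raise ValueError("no tokens to score")
    return correct / total


# Toy example with invented tags (not taken from the Ugnayan treebank).
gold = [["NOUN", "VERB", "PUNCT"], ["PRON", "VERB"]]
pred = [["NOUN", "ADJ", "PUNCT"], ["PRON", "VERB"]]
print(f"{token_accuracy(gold, pred):.2%}")  # 4 of 5 tokens match: 80.00%
```

Scoring per token rather than per sentence is the usual convention for POS tagging, so one mistagged word does not count a whole sentence as wrong.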

## POS Tags

- ADJ, ADP, ADV, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, VERB
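
A token-classification head emits integer class indices, so the tag list above is typically paired with a label mapping. The sketch below builds one in plain Python. The index order (alphabetical, as listed) is an assumption: the mapping stored in the model's own configuration is authoritative and may differ. The names `POS_TAGS`, `label2id`, `id2label`, and `decode` are illustrative.

```python
from typing import Dict, List

# The 14 tags listed in this card, in the order they appear.
POS_TAGS: List[str] = [
    "ADJ", "ADP", "ADV", "CCONJ", "DET", "INTJ", "NOUN",
    "NUM", "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "VERB",
]

# Assumed mapping: index = position in the list above (hypothetical order).
label2id: Dict[str, int] = {tag: i for i, tag in enumerate(POS_TAGS)}
id2label: Dict[int, str] = {i: tag for tag, i in label2id.items()}


def decode(ids: List[int]) -> List[str]:
    """Map predicted class indices back to POS tag strings."""
    return [id2label[i] for i in ids]


print(decode([6, 13, 11]))  # ['NOUN', 'VERB', 'PUNCT']
```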