Commit 340e484 by iceman2434 • 1 Parent(s): 99a063d
Create README.md
README.md ADDED
@@ -0,0 +1,22 @@
---
datasets:
- universal_dependencies
language:
- tl
metrics:
- f1
pipeline_tag: token-classification
---
## Model Specification
- Model: RoBERTa Tagalog Base (Jan Christian Blaise Cruz)
- Randomized training order of languages
- Training Data:
  - Combined English, Serbian, Slovenian, Naija, and Manx-Cadhan corpora (Top 5 Languages)
- Training Details:
  - Base configuration with a learning rate of 5e-5
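
The card does not say how the training order of languages was randomized. As one possible reading, the sketch below shuffles the five training languages with a seeded generator so the order is reproducible. The function name `randomized_language_order` and the seed value are assumptions made for illustration, not details taken from the original training setup.

```python
import random

# Training languages as listed in the model card.
TRAIN_LANGUAGES = ["English", "Serbian", "Slovenian", "Naija", "Manx-Cadhan"]


def randomized_language_order(languages, seed):
    """Return a shuffled copy of `languages`; a fixed seed makes it reproducible."""
    rng = random.Random(seed)   # local generator, leaves global random state alone
    order = list(languages)     # copy so the input list is not modified
    rng.shuffle(order)
    return order


# The seed value is a hypothetical choice for this example.
order = randomized_language_order(TRAIN_LANGUAGES, seed=42)
print(order)
```

Using a local `random.Random` instance keeps the shuffle independent of any other randomness in a training script.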
## Evaluation
- Evaluation Dataset: Universal Dependencies Tagalog Ugnayan (test set)
- Tested in a zero-shot cross-lingual scenario on the Universal Dependencies Tagalog Ugnayan test set, reaching 72.52% accuracy
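
The card does not state how the accuracy figure was computed. A common choice for POS tagging is token-level accuracy, sketched below under that assumption. The helper `token_accuracy` and the toy tag sequences are hypothetical and are not drawn from the Ugnayan data.

```python
def token_accuracy(gold_sentences, predicted_sentences):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    correct = 0
    total = 0
    for gold, pred in zip(gold_sentences, predicted_sentences):
        if len(gold) != len(pred):
            raise ValueError("gold and predicted sentences must align token by token")
        for gold_tag, pred_tag in zip(gold, pred):
            correct += int(gold_tag == pred_tag)
            total += 1
    # Avoid division by zero when there are no tokens at all.
    return correct / total if total else 0.0


# Toy example (made-up tags, not real evaluation data): 4 of 5 tokens match.
gold = [["PRON", "VERB", "DET", "NOUN", "PUNCT"]]
pred = [["PRON", "VERB", "ADP", "NOUN", "PUNCT"]]
print(f"{token_accuracy(gold, pred):.2%}")  # 80.00%
```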
## POS Tags
- ADJ – ADP – ADV – CCONJ – DET – INTJ – NOUN – NUM – PART – PRON – PROPN – PUNCT – SCONJ – VERB
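
For token classification, a tag set like this is usually paired with integer label IDs. The sketch below builds such a mapping from the 14 tags listed above. The index order is an assumption (it simply follows the listed order), so the actual checkpoint's configuration may assign different IDs.

```python
# The 14 POS tags listed in this model card.
POS_TAGS = [
    "ADJ", "ADP", "ADV", "CCONJ", "DET", "INTJ", "NOUN",
    "NUM", "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "VERB",
]

# Assumed ID assignment: position in the list above (hypothetical, for illustration).
label2id = {tag: index for index, tag in enumerate(POS_TAGS)}
id2label = {index: tag for tag, index in label2id.items()}

print(len(POS_TAGS))      # 14
print(label2id["NOUN"])   # 6
print(id2label[13])       # VERB
```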