iceman2434/roberta-tagalog-base-ft-udpos213-top4langrandom


---
datasets:
- universal_dependencies
language:
- tl
metrics:
- f1
pipeline_tag: token-classification
---

## Model Specification
- Model: RoBERTa Tagalog Base (Jan Christian Blaise Cruz)
- Randomized training order of languages
- Training Data:
  - Combined English, Serbian, Slovenian, and Naija corpora (Top 4 Languages)
- Training Details:
  - Base configurations with learning rate 5e-5
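
The card does not say how the language order was randomized. As a minimal sketch, assuming the randomization happens at the level of whole language corpora (not individual sentences), the training stream could be assembled as below. The function name `build_training_stream`, the seed, and the placeholder sentences are hypothetical and are not taken from the original training code.

```python
import random


def build_training_stream(corpora, seed=0):
    """Concatenate per-language corpora in a shuffled language order.

    ASSUMPTION: shuffling is done per language block; the model card
    only states that the training order of languages was randomized.
    """
    rng = random.Random(seed)      # local RNG keeps the shuffle reproducible
    order = list(corpora)          # language names
    rng.shuffle(order)
    stream = [(lang, sent) for lang in order for sent in corpora[lang]]
    return order, stream


# Placeholder data standing in for the four UD training corpora.
corpora = {
    "English": ["en-sentence-1", "en-sentence-2"],
    "Serbian": ["sr-sentence-1"],
    "Slovenian": ["sl-sentence-1", "sl-sentence-2"],
    "Naija": ["pcm-sentence-1"],
}

order, stream = build_training_stream(corpora, seed=42)
print(order)        # a permutation of the four language names
print(len(stream))  # total number of sentences across all corpora
```

Passing the same seed reproduces the same language order, which makes a randomized run repeatable.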

## Evaluation
- Evaluation Dataset: Universal Dependencies Tagalog Ugnayan (Testing Set)
- Setting: zero-shot cross-lingual (no Tagalog data in the fine-tuning corpora)
- Result: 72.97% accuracy on the Tagalog Ugnayan test set
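
The card does not define how accuracy was computed. A minimal sketch, assuming it is token-level accuracy (correctly tagged tokens divided by all tokens in the test set), is shown below. The helper `token_accuracy` and the toy sentences are illustrative only and do not reproduce the reported figure.

```python
def token_accuracy(gold, pred):
    """Fraction of tokens whose predicted tag equals the gold tag.

    `gold` and `pred` are lists of sentences; each sentence is a list
    of POS tags aligned token by token.
    """
    correct = 0
    total = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("gold and predicted sentences differ in length")
        correct += sum(g == p for g, p in zip(gold_sent, pred_sent))
        total += len(gold_sent)
    return correct / total if total else 0.0


# Toy example: 4 of 5 tokens are tagged correctly.
gold = [["NOUN", "VERB", "PUNCT"], ["PRON", "VERB"]]
pred = [["NOUN", "ADV", "PUNCT"], ["PRON", "VERB"]]
print(f"{token_accuracy(gold, pred):.2%}")  # 80.00%
```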

## POS Tags
- ADJ – ADP – ADV – CCONJ – DET – INTJ – NOUN – NUM – PART – PRON – PROPN – PUNCT – SCONJ – VERB
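
The 14 tags listed above are a subset of the 17 Universal POS (UPOS) tags defined by Universal Dependencies. The sketch below records the tag set and shows which UPOS categories the model does not predict. The integer ids are an assumption that simply follows the order given in this card; the `id2label` mapping in the model's `config.json` should be treated as authoritative.

```python
# The 17 universal POS tags defined by Universal Dependencies.
UPOS = {
    "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
}

# Tags listed in this model card, in the order the card gives them.
MODEL_TAGS = [
    "ADJ", "ADP", "ADV", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "VERB",
]

# ASSUMPTION: ids follow the card's listing order; verify against config.json.
label2id = {tag: i for i, tag in enumerate(MODEL_TAGS)}
id2label = {i: tag for tag, i in label2id.items()}

# UPOS categories that are absent from the model's label set.
missing = sorted(UPOS - set(MODEL_TAGS))
print(missing)  # ['AUX', 'SYM', 'X']
```

Gold tokens tagged with one of the missing categories can never be predicted correctly by this label set, which is worth keeping in mind when interpreting the accuracy figure.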