Description

The dataset is drawn from 148 Filipino storytelling books and contains 4,523 sentences, 7,118 tokens, and 868 unique tokens.
This NER model supports only the Filipino language. It does not currently cover proper nouns, verbs, adjectives, or adverbs.
Input must be preprocessed before it is passed to the model. I will upload the preprocessing code to GitHub soon.
Until then, use the following example as a guide for reproducing the preprocessed input:
Input: "May umaapoy na bahay"
Preprocessed Input: "apoy bahay"
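
The preprocessing code is not published yet, so the following is only a minimal sketch of what the example above suggests: drop function words and reduce each remaining word to its root. The `STOP_WORDS` set and the `ROOTS` lookup are hypothetical stand-ins that contain just enough entries to reproduce the example; they are not the author's actual resources, and a real pipeline would need a full Filipino stop-word list and stemmer.

```python
# Hypothetical, illustrative resources (assumptions, not the author's data).
STOP_WORDS = {"may", "na"}       # function words to drop
ROOTS = {"umaapoy": "apoy"}      # inflected form -> root word


def preprocess(text: str) -> str:
    """Lowercase, remove stop words, and map each remaining word to its root."""
    tokens = text.lower().split()  # split() also discards stray whitespace
    kept = [ROOTS.get(tok, tok) for tok in tokens if tok not in STOP_WORDS]
    return " ".join(kept)


print(preprocess("May umaapoy na bahay"))  # apoy bahay
```

Words missing from the lookup pass through unchanged, so the sketch stays usable while the lookup is incomplete.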

roberta-tagalog-large-ner-v1

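This model is a fine-tuned version of jcblaise/roberta-tagalog-large on the dataset described above. After the final training epoch (epoch 10, step 2050) it reaches the following results on the evaluation set: validation loss 0.1866, precision 0.9546, recall 0.9557, F1 0.9551, and accuracy 0.9724.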
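
A minimal sketch of running the model with the Hugging Face `transformers` token-classification pipeline. It assumes `transformers` and a backend such as PyTorch are installed. The `model_id` argument must be the model's full Hub path, which is not given here, and the label names in the output depend on the model's configuration. The helper names `tag_text` and `to_pairs` are illustrative only.

```python
from typing import Dict, List, Tuple


def tag_text(preprocessed_text: str, model_id: str) -> List[Dict]:
    """Run token classification on text that has already been preprocessed."""
    # Imported lazily so to_pairs() can be used without transformers installed.
    from transformers import pipeline

    ner = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge sub-word pieces into whole words
    )
    return ner(preprocessed_text)


def to_pairs(entities: List[Dict]) -> List[Tuple[str, str]]:
    """Reduce the pipeline output to (word, label) pairs."""
    return [(e["word"].strip(), e["entity_group"]) for e in entities]
```

With `model_id` set to the model's Hub path, `to_pairs(tag_text("apoy bahay", model_id))` would return one `(word, label)` pair per detected entity.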

The hyperparameters used during training are not recorded here. The table below lists the training and evaluation metrics after each epoch:

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy
No log	1.0	205	0.2044	0.8945	0.8920	0.8933	0.9414
No log	2.0	410	0.1421	0.9410	0.9341	0.9375	0.9625
0.2423	3.0	615	0.1485	0.9309	0.9500	0.9403	0.9670
0.2423	4.0	820	0.1543	0.9473	0.9505	0.9489	0.9689
0.0154	5.0	1025	0.1749	0.9494	0.9494	0.9494	0.9706
0.0154	6.0	1230	0.1706	0.9459	0.9545	0.9502	0.9713
0.0154	7.0	1435	0.1822	0.9490	0.9522	0.9506	0.9717
0.003	8.0	1640	0.1841	0.9529	0.9540	0.9534	0.9723
0.003	9.0	1845	0.1870	0.9540	0.9551	0.9545	0.9729
0.0007	10.0	2050	0.1866	0.9546	0.9557	0.9551	0.9724
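
As a quick sanity check on the table, F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below recomputes it for each row (values copied from the table) and reports which epoch has the lowest validation loss and which has the highest F1. Differences in the last digit are expected because the table values are already rounded.

```python
# (epoch, validation_loss, precision, recall, f1) copied from the table above
ROWS = [
    (1, 0.2044, 0.8945, 0.8920, 0.8933),
    (2, 0.1421, 0.9410, 0.9341, 0.9375),
    (3, 0.1485, 0.9309, 0.9500, 0.9403),
    (4, 0.1543, 0.9473, 0.9505, 0.9489),
    (5, 0.1749, 0.9494, 0.9494, 0.9494),
    (6, 0.1706, 0.9459, 0.9545, 0.9502),
    (7, 0.1822, 0.9490, 0.9522, 0.9506),
    (8, 0.1841, 0.9529, 0.9540, 0.9534),
    (9, 0.1870, 0.9540, 0.9551, 0.9545),
    (10, 0.1866, 0.9546, 0.9557, 0.9551),
]


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Each reported F1 should match the recomputed value up to rounding.
for epoch, _, p, r, f1 in ROWS:
    assert abs(f1_score(p, r) - f1) < 2e-4, f"epoch {epoch} is inconsistent"

best_loss_epoch = min(ROWS, key=lambda row: row[1])[0]
best_f1_epoch = max(ROWS, key=lambda row: row[4])[0]
print(best_loss_epoch, best_f1_epoch)  # 2 10
```

Validation loss is lowest at epoch 2 while F1 keeps improving through epoch 10, so which checkpoint counts as best depends on the metric used to select it.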