updating yoruba readme
README.md
#### Limitations and bias

This model is limited by its training dataset of entity-annotated news articles from a specific span of time. It may not generalize well to all use cases in different domains.

## Training data

This model was fine-tuned on the Bible, JW300, [Menyo-20k](https://huggingface.co/datasets/menyo20k_mt), the [Yoruba Embedding corpus](https://huggingface.co/datasets/yoruba_text_c3), [CC-Aligned](https://opus.nlpl.eu/), Wikipedia, news corpora (BBC Yoruba, VON Yoruba, Asejere, Alaroye), and other small datasets curated from friends.

## Training procedure

This model was trained on a single NVIDIA V100 GPU.
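As the card notes above, the model can be used with the Transformers *pipeline* for masked token prediction. A minimal sketch — the Hub model ID below is an assumption, so substitute this model's actual repository ID, and the Yorùbá sentence is purely illustrative:

```python
from transformers import pipeline

# Hypothetical Hub ID -- replace with this model's actual repository ID.
unmasker = pipeline(
    "fill-mask",
    model="Davlan/bert-base-multilingual-cased-finetuned-yoruba",
)

# Predict the most likely fillers for the [MASK] token in a Yorùbá sentence
# ("Mo fẹ́ràn [MASK]." -- "I like [MASK].").
for prediction in unmasker("Mo fẹ́ràn [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))
```

Each prediction is a dict with `sequence`, `score`, `token`, and `token_str` fields, sorted by descending score.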
## Eval results on Test set (F-score, average over 5 runs)

Dataset | mBERT F1 | yo_bert F1
-|-|-
[Yorùbá GV NER](https://huggingface.co/datasets/yoruba_gv_ner) | |
[MasakhaNER](https://github.com/masakhane-io/masakhane-ner) | 78.97 |
[BBC Yorùbá Textclass](https://huggingface.co/datasets/yoruba_bbc_topics) | 75.13 | 79.11

### BibTeX entry and citation info

By David Adelani