---
license: apache-2.0
language:
- yo
---

# oyo-teams-base-discriminator
OYO-BERT (or Oyo-dialect of Yoruba BERT) was created by pre-training a [TEAMS model based on the ELECTRA architecture](https://aclanthology.org/2021.findings-acl.219/) on Yoruba-language texts for about 100K steps.
It was trained with the ELECTRA-base architecture configuration using the [TensorFlow Model Garden](https://github.com/tensorflow/models/tree/master/official/projects).

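As a rough illustration of the replaced-token-detection objective that ELECTRA-style models such as TEAMS are trained on, the toy sketch below corrupts some input tokens and builds the per-position labels a discriminator learns to predict. This is not the actual training code; the function name, the sampling scheme, and the tiny Yoruba vocabulary are all illustrative.

```python
import random

def make_rtd_example(tokens, vocab, mask_rate=0.15, seed=0):
    """Build one replaced-token-detection (RTD) training pair:
    a corrupted copy of `tokens` plus per-position labels saying
    which tokens the discriminator should flag as replaced."""
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_rate:
            # In ELECTRA/TEAMS a small generator proposes a plausible
            # substitute; here we just sample a different vocab item.
            corrupted.append(rng.choice([v for v in vocab if v != tok]))
            labels.append(1)  # replaced -> discriminator target 1
        else:
            corrupted.append(tok)
            labels.append(0)  # kept -> discriminator target 0
    return corrupted, labels

# Toy example (illustrative tokens, not the real tokenizer output)
tokens = "mo nífẹ̀ẹ́ èdè yorùbá gan-an".split()
vocab = tokens + ["ilé", "omi", "ọjà"]
corrupted, labels = make_rtd_example(tokens, vocab, mask_rate=0.5, seed=3)
```

The discriminator released here is the network trained to output those 0/1 labels; unlike BERT's masked-language-modeling loss, every position contributes to the objective, which is what makes ELECTRA-style pre-training sample-efficient.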
### Pre-training corpus
A mix of WURA, Wikipedia, and MT560 Yoruba data.

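One simple way to combine several sources like these into a single pre-training stream is round-robin interleaving. The sketch below is only illustrative of that idea, not the pipeline actually used for this model, and the corpus lines are made up.

```python
def interleave_corpora(*corpora):
    """Round-robin over several line/document sources until all are
    exhausted, yielding one mixed pre-training stream in which no
    single corpus dominates the start of training."""
    iterators = [iter(c) for c in corpora]
    while iterators:
        for it in list(iterators):
            try:
                yield next(it)
            except StopIteration:
                iterators.remove(it)  # this corpus is used up

# Illustrative stand-ins for the three corpora
wura = ["wura-1", "wura-2", "wura-3"]
wiki = ["wiki-1"]
mt560 = ["mt560-1", "mt560-2"]
mixed = list(interleave_corpora(wura, wiki, mt560))
```

Every line from each source appears exactly once in `mixed`, with the sources alternating until the shorter ones run out.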
### Acknowledgment
We thank [@stefan-it](https://github.com/stefan-it) for providing the pre-processing and pre-training scripts. Finally, we would like to thank Google Cloud for giving us access to a TPU v3-8 through free cloud credits. The model was trained using Flax before being converted to PyTorch.

### BibTeX entry and citation info