Davlan committed
Commit 24ced61
1 Parent(s): 6cb0d71

Update README.md

Files changed (1)
  1. README.md +18 -0
README.md CHANGED
@@ -1,3 +1,21 @@
  ---
  license: apache-2.0
+ language:
+ - yo
  ---
+
+
+ # oyo-teams-base-discriminator
+
+ OYO-BERT (or Oyo-dialect of Yoruba BERT) was created by pre-training a [TEAMS model based on the ELECTRA architecture](https://aclanthology.org/2021.findings-acl.219/) on Yoruba-language texts for about 100K steps.
+ It was trained using the ELECTRA-base architecture with the [TensorFlow Model Garden](https://github.com/tensorflow/models/tree/master/official/projects).
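+
+ ### How to use
+ A minimal usage sketch with the Hugging Face Transformers library is shown below (feature extraction with the discriminator). The Hub repo ID `Davlan/oyo-teams-base-discriminator` is an assumption based on the committer and model name; adjust it if the checkpoint is hosted under a different name.
+
+ ```python
+ from transformers import AutoTokenizer, AutoModel
+
+ model_id = "Davlan/oyo-teams-base-discriminator"  # assumed Hub repo ID
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModel.from_pretrained(model_id)
+
+ # Encode a sample Yoruba sentence and extract contextual embeddings.
+ text = "Báwo ni?"  # replace with your own Yoruba input
+ inputs = tokenizer(text, return_tensors="pt")
+ outputs = model(**inputs)
+ print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)
+ ```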
+
+ ### Pre-training corpus
+ A mix of WURA, Wikipedia, and MT560 Yoruba data.
+
+
+ ### Acknowledgment
+ We thank [@stefan-it](https://github.com/stefan-it) for providing the pre-processing and pre-training scripts. Finally, we would like to thank Google Cloud for providing us access to a TPU v3-8 through free cloud credits. The model was trained with Flax before being converted to PyTorch.
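+
+ The Flax-to-PyTorch conversion mentioned above can be reproduced roughly as sketched below; this is an illustrative sketch, not the authors' script, and the checkpoint path is a placeholder.
+
+ ```python
+ # Load a Flax checkpoint into the PyTorch ELECTRA discriminator class and
+ # re-save it with PyTorch weights (writes pytorch_model.bin).
+ from transformers import ElectraForPreTraining
+
+ flax_checkpoint = "path/to/flax_checkpoint"  # placeholder local directory
+ model = ElectraForPreTraining.from_pretrained(flax_checkpoint, from_flax=True)
+ model.save_pretrained("oyo-teams-base-discriminator-pytorch")
+ ```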
+
+
+ ### BibTeX entry and citation info