abdouaziiz committed on
Commit
df8bdb4
1 Parent(s): da7fbc0

Create README.md

Files changed (1)
  1. README.md +61 -0
README.md ADDED
@@ -0,0 +1,61 @@
---
language: wo
tags:
- bert
- language-model
- wo
- wolof
---

# Soraberta: Unsupervised Language Model Pre-training for Wolof

**bert-base-wolof** is a bert-base model pretrained on the Wolof language.

## Soraberta models

| Model name | Number of layers | Attention Heads | Embedding Dimension | Total Parameters |
| :------: | :---: | :---: | :---: | :---: |
| `bert-base` | 6 | 12 | 514 | 56,931,622 (≈ 57 M) |

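The parameter total in the table can be checked against a loaded checkpoint. The following is a minimal sketch, assuming a PyTorch-style model whose `parameters()` yields tensors exposing `numel()` and `requires_grad`. The helper names `count_parameters` and `human_readable` are hypothetical and not part of Transformers, and the count you obtain may differ from the table depending on which task head is loaded.

```python
def count_parameters(model, trainable_only=False):
    """Sum the element counts of a model's parameter tensors."""
    return sum(
        p.numel()
        for p in model.parameters()
        if p.requires_grad or not trainable_only
    )


def human_readable(n):
    """Format a raw count such as 56931622 as '56.9M'."""
    for divisor, suffix in ((1_000_000_000, "B"), (1_000_000, "M"), (1_000, "K")):
        if n >= divisor:
            return f"{n / divisor:.1f}{suffix}"
    return str(n)


# Usage against the published checkpoint (assumes network access and PyTorch):
# from transformers import AutoModelForMaskedLM
# model = AutoModelForMaskedLM.from_pretrained("abdouaziiz/soraberta")
# print(human_readable(count_parameters(model)))

# The raw figure from the table, expressed in millions.
print(human_readable(56931622))  # 56.9M
```

Written this way, the raw count and the rounded figure in millions stay distinct.
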
## Using Soraberta with Hugging Face's Transformers

```python
>>> from transformers import pipeline
>>> unmasker = pipeline('fill-mask', model='abdouaziiz/soraberta')
>>> unmasker("juroom naari jullit man nanoo boole jend aw nag walla <mask>.")

[{'sequence': 'juroom naari jullit man nanoo boole jend aw nag walla gileem.',
  'score': 0.9783930778503418,
  'token': 4621,
  'token_str': ' gileem'},
 {'sequence': 'juroom naari jullit man nanoo boole jend aw nag walla jend.',
  'score': 0.009271537885069847,
  'token': 2155,
  'token_str': ' jend'},
 {'sequence': 'juroom naari jullit man nanoo boole jend aw nag walla aw.',
  'score': 0.0027585660573095083,
  'token': 704,
  'token_str': ' aw'},
 {'sequence': 'juroom naari jullit man nanoo boole jend aw nag walla pel.',
  'score': 0.001120452769100666,
  'token': 1171,
  'token_str': ' pel'},
 {'sequence': 'juroom naari jullit man nanoo boole jend aw nag walla juum.',
  'score': 0.0005133090307936072,
  'token': 5820,
  'token_str': ' juum'}]
```
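
The pipeline returns a list of dictionaries with the keys `sequence`, `score`, `token`, and `token_str`. The following is a minimal sketch of post-processing that list. `top_predictions` is a hypothetical helper name and the 0.005 threshold is an arbitrary assumption. The sample entries are copied from the output above, with `sequence` omitted for brevity, so the snippet runs without downloading the model.

```python
def top_predictions(results, min_score=0.005):
    """Return (token, score) pairs whose score is at least min_score.

    `results` is a list of dicts shaped like the fill-mask pipeline output.
    Leading whitespace in `token_str` is stripped for display.
    """
    return [
        (r["token_str"].strip(), r["score"])
        for r in results
        if r["score"] >= min_score
    ]


# Sample data copied from the pipeline output shown in this README.
sample = [
    {"score": 0.9783930778503418, "token": 4621, "token_str": " gileem"},
    {"score": 0.009271537885069847, "token": 2155, "token_str": " jend"},
    {"score": 0.0027585660573095083, "token": 704, "token_str": " aw"},
    {"score": 0.001120452769100666, "token": 1171, "token_str": " pel"},
    {"score": 0.0005133090307936072, "token": 5820, "token_str": " juum"},
]

for token, score in top_predictions(sample):
    print(f"{token}: {score:.4f}")
# gileem: 0.9784
# jend: 0.0093
```

With a real pipeline, the same helper could be applied directly to the return value of `unmasker(...)`.
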

## Training data

The training data comes from [Bible OT](http://biblewolof.com/), [WOLOF-ONLINE](http://www.wolof-online.com/), and [ALFFA_PUBLIC](https://github.com/getalp/ALFFA_PUBLIC/tree/master/ASR/WOLOF).

## Contact

Please contact abdouaziz@gmail.com with any questions, feedback, or requests.