abdouaziiz committed on
Commit
514daa2
1 Parent(s): 8cb26da

Upload README.md

  1. README.md +62 -0
README.md ADDED
@@ -0,0 +1,62 @@
+ ---
+ language: fr
+ license: mit
+ tags:
+ - roberta
+ - language-model
+ - wo
+ - wolof
+ - french
+ ---
+
+ # Soraberta: Unsupervised Language Model Pre-training for Wolof
+
+ **Soraberta** is a RoBERTa-base model pretrained on the Wolof language. RoBERTa was introduced in [this paper](https://arxiv.org/abs/1907.11692).
+
+ ## Soraberta models
+
+ | Model name | Number of layers | Attention Heads | Embedding Dimension | Total Parameters |
+ | :------: | :---: | :---: | :---: | :---: |
+ | `soraberta-base` | 6 | 12 | 514 | 18 M |
+
+ ## Using Soraberta with Hugging Face's Transformers
+
+ ```python
+ >>> from transformers import pipeline
+ >>> unmasker = pipeline('fill-mask', model='abdouaziz/soraberta')
+ >>> unmasker("juroom naari jullit man nanoo boole jend aw nag walla <mask>.")
+
+ [{'sequence': 'juroom naari jullit man nanoo boole jend aw nag walla gileem.',
+   'score': 0.9783930778503418,
+   'token': 4621,
+   'token_str': ' gileem'},
+  {'sequence': 'juroom naari jullit man nanoo boole jend aw nag walla jend.',
+   'score': 0.009271537885069847,
+   'token': 2155,
+   'token_str': ' jend'},
+  {'sequence': 'juroom naari jullit man nanoo boole jend aw nag walla aw.',
+   'score': 0.0027585660573095083,
+   'token': 704,
+   'token_str': ' aw'},
+  {'sequence': 'juroom naari jullit man nanoo boole jend aw nag walla pel.',
+   'score': 0.001120452769100666,
+   'token': 1171,
+   'token_str': ' pel'},
+  {'sequence': 'juroom naari jullit man nanoo boole jend aw nag walla juum.',
+   'score': 0.0005133090307936072,
+   'token': 5820,
+   'token_str': ' juum'}]
+ ```
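
The fill-mask output above is a list of dictionaries with `sequence`, `score`, `token`, and `token_str` keys. As a minimal sketch of post-processing such a result, the helper below keeps only candidates above a score threshold. The names `confident_fills` and `min_score` are hypothetical and not part of Transformers, and the sample list reuses the first three predictions shown above rather than calling the model.

```python
from typing import Dict, List

# First three predictions copied from the fill-mask example output above.
sample_predictions: List[Dict] = [
    {"sequence": "juroom naari jullit man nanoo boole jend aw nag walla gileem.",
     "score": 0.9783930778503418, "token": 4621, "token_str": " gileem"},
    {"sequence": "juroom naari jullit man nanoo boole jend aw nag walla jend.",
     "score": 0.009271537885069847, "token": 2155, "token_str": " jend"},
    {"sequence": "juroom naari jullit man nanoo boole jend aw nag walla aw.",
     "score": 0.0027585660573095083, "token": 704, "token_str": " aw"},
]


def confident_fills(predictions: List[Dict], min_score: float = 0.01) -> List[str]:
    """Return candidate tokens scoring at least min_score, best first.

    Hypothetical helper for illustration; the threshold is an assumption.
    """
    # Sort by score in descending order so the best candidate comes first.
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    # token_str carries a leading space in the output above, so strip it.
    return [p["token_str"].strip() for p in ranked if p["score"] >= min_score]


print(confident_fills(sample_predictions))
```

In practice the list returned by `unmasker(...)` could be passed in place of `sample_predictions`; a suitable `min_score` depends on the task.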
+
+ ## Training data
+ The training data comes from the [Bible OT](http://biblewolof.com/) and [Wolof Online](http://www.wolof-online.com/).
+
+ ## Contact
+
+ Please contact abdouaziz@gmail.com with any questions, feedback, or requests.