Update README.md
README.md
CHANGED
@@ -1,6 +1,6 @@
 # FongBERT
 
-FongBERT is a BERT model trained on
+FongBERT is a BERT model trained on 68.363 sentences in [Fon](https://en.wikipedia.org/wiki/Fon_language). The data are compiled from [JW300](https://opus.nlpl.eu/JW300.php) and other additional data I scraped from the [JW](https://www.jw.org/en/) website.
 It is the first pretrained model to leverage transfer learning for downtream tasks for Fon.
 Below are some examples of missing word prediction.
 
@@ -18,18 +18,18 @@ fill = pipeline('fill-mask', model=model, tokenizer=tokenizer)
 
 #### Example 1
 
-**Sentence 1**:
+**Sentence 1**: un tuùn ɖɔ un jló na wazɔ̌ nú we . **Translation**: I know, I have to work for you.
 
-**Masked Sentence**:
+**Masked Sentence**: un tuùn ɖɔ un jló na wazɔ̌ <"mask"> we . **Translation**: I know, I have to work <"mask"> you.
 
-fill(f'
+fill(f'un tuùn ɖɔ un jló na wazɔ̌ {fill.tokenizer.mask_token} we')
 
-[{'score': 0.
-'sequence': '
-'token':
-'token_str': '
-{'score': 0.
-'sequence': '
+[{'score': 0.994536280632019,
+'sequence': 'un tuùn ɖɔ un jló na wazɔ̌ nú we',
+'token': 312,
+'token_str': ' nú'},
+{'score': 0.0015309195732697845,
+'sequence': 'un tuùn ɖɔ un jló na wazɔ̌nu we',
 ...........]
 
 
@@ -39,12 +39,12 @@ fill(f'wa wazɔ xa {fill.tokenizer.mask_token}')
 
 **Masked Sentence**: un yi <"mask"> nu we ɖesu . **Translation**: I <"mask"> you so much.
 
-[{'score': 0.
+[{'score': 0.31483960151672363,
 'sequence': 'un yi wan nu we ɖesu',
-'token':
+'token': 639,
 'token_str': ' wan'},
-{'score': 0.
-'sequence': 'un yi
+{'score': 0.20940221846103668,
+'sequence': 'un yi ba nu we ɖesu',
 ...........]
 
 
@@ -54,11 +54,10 @@ fill(f'wa wazɔ xa {fill.tokenizer.mask_token}')
 
 **Masked Sentence**: un yì cí sunnu xɔ́ntɔn ce Tony gɔ́n nú <"mask"> ɖé . **Translation**: I went to my boyfriend for a <"mask">.
 
-[{'score': 0.
-'sequence': 'un yì cí sunnu xɔ́ntɔn ce Tony gɔ́n nú é ɖé',
-'token': 278,
-'token_str': ' é'},
-{'score': 0.1764318197965622,
+[{'score': 0.934298574924469,
 'sequence': 'un yì cí sunnu xɔ́ntɔn ce Tony gɔ́n nú táan ɖé',
-'token':
+'token': 1102,
+'token_str': ' táan'},
+{'score': 0.03750855475664139,
+'sequence': 'un yì cí sunnu xɔ́ntɔn ce Tony gɔ́n nú ganxixo ɖé',
 ...........]
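The outputs in the updated README are lists of candidate dictionaries with a `score` and the completed `sequence`. As a rough illustration of how such a list might be consumed, the sketch below picks the highest-scoring candidate from the two entries the README shows for Example 1. It does not load the model; the values are copied from the diff, the second entry's `token` and `token_str` are left out because the README elides them, and `best_prediction` is a hypothetical helper name, not part of the README or of any library.

```python
# The two candidates shown for Example 1 in the updated README.
# Remaining candidates are elided in the README ("...........]"),
# so only these two are reproduced here.
predictions = [
    {
        "score": 0.994536280632019,
        "sequence": "un tuùn ɖɔ un jló na wazɔ̌ nú we",
        "token": 312,
        "token_str": " nú",
    },
    {
        "score": 0.0015309195732697845,
        "sequence": "un tuùn ɖɔ un jló na wazɔ̌nu we",
    },
]


def best_prediction(candidates):
    """Return (sequence, score) for the highest-scoring candidate.

    Hypothetical helper for illustration; it only assumes each
    candidate is a dict with "score" and "sequence" keys.
    """
    top = max(candidates, key=lambda c: c["score"])
    return top["sequence"], top["score"]


sequence, score = best_prediction(predictions)
print(f"{sequence}  (score={score:.3f})")
```

In a live session the same helper could presumably be applied to the return value of the `fill(...)` call shown in the README, assuming it yields the list format above.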