---
language: twi
license: mit
---
## TwiBERT
## Model Description
TwiBERT is a pre-trained language model specifically designed for the Twi language, which is widely spoken in Ghana, 
West Africa. This model has 61 million parameters, 6 layers, 6 attention heads, 768 hidden units, and a feed-forward size of 3072. 
TwiBERT was pre-trained on a combination of the Asanti Twi Bible and a crowdsourced Twi text dataset.
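
For reference, the figures above correspond to a BERT-style configuration along the lines of the sketch below. This is a hypothetical illustration, not taken from the repository; the vocabulary size is not stated in this card, so the Transformers default is left in place and the exact parameter count will differ.

```python
from transformers import BertConfig

# Hypothetical configuration matching the sizes quoted above. The vocabulary
# size is not documented in this card, so the library default is kept.
config = BertConfig(
    num_hidden_layers=6,     # 6 transformer layers
    num_attention_heads=6,   # 6 attention heads per layer
    hidden_size=768,         # 768 hidden units
    intermediate_size=3072,  # feed-forward size
)
```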



## Limitations

The model was trained on a relatively small dataset (approximately 5 MB), 
which may limit its ability to learn rich contextual embeddings and to generalize. 
In addition, because the training data is dominated by the Bible, the model's output may carry a strong religious bias. 


## How to use it

You can use TwiBERT by fine-tuning it on a downstream task. 
The example below shows how to load the TwiBERT model and tokenizer for a downstream task such as token classification:

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Load the pre-trained TwiBERT weights and tokenizer from the Hugging Face Hub.
# The token-classification head is newly initialized and must be fine-tuned.
tokenizer = AutoTokenizer.from_pretrained("sakrah/TwiBERT")
model = AutoModelForTokenClassification.from_pretrained("sakrah/TwiBERT")
```
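
Once loaded, the model can be applied to Twi text. The snippet below is a minimal sketch (not part of the original card) that continues from the loading code above: it tokenizes a short Twi phrase and runs a forward pass. The classification head is untrained at this point, so the logits are only meaningful after fine-tuning on labelled data.

```python
import torch

# "Akwaaba" ("welcome") is used here only as a placeholder input sentence.
inputs = tokenizer("Akwaaba", return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Per-token logits with shape (batch_size, sequence_length, num_labels);
# these are effectively random until the head has been fine-tuned.
print(outputs.logits.shape)
```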