# MorRoBERTa

MorRoBERTa is a scaled-down variant of the RoBERTa-base model designed specifically for the Moroccan Arabic dialect. It comprises 6 layers, 12 attention heads, and a hidden size of 768. The model was trained for 12 epochs on the complete training set, a process that took approximately 92 hours, using a corpus of six million Moroccan dialect sentences amounting to 71 billion tokens.

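For reference, the architecture described above corresponds roughly to the following `transformers` `RobertaConfig` (a sketch only; the actual upstream configuration may set additional fields such as vocabulary size):

```python
from transformers import RobertaConfig

# Architecture figures taken from the description above; all other fields
# (vocab size, max position embeddings, ...) keep the library defaults.
config = RobertaConfig(
    num_hidden_layers=6,     # 6 transformer layers
    num_attention_heads=12,  # 12 attention heads per layer
    hidden_size=768,         # hidden size of 768
)
print(config.num_hidden_layers, config.num_attention_heads, config.hidden_size)
```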
## Usage

The model weights can be loaded with the Hugging Face `transformers` library:

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("otmangi/MorRoBERTa")
model = AutoModel.from_pretrained("otmangi/MorRoBERTa")
```
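`AutoModel` returns token-level hidden states rather than a single sentence vector. One common way to derive a sentence embedding (not prescribed by this model card) is attention-mask-weighted mean pooling over the last hidden state. A minimal sketch with NumPy, using a dummy array in place of real model output:

```python
import numpy as np

def mean_pool(hidden_states, attention_mask):
    """Average token embeddings, ignoring padding positions."""
    # hidden_states: (seq_len, hidden_dim); attention_mask: (seq_len,) of 0/1
    mask = attention_mask[:, None].astype(hidden_states.dtype)
    return (hidden_states * mask).sum(axis=0) / mask.sum()

# Dummy stand-in for model output: 4 tokens, hidden size 768 (as in MorRoBERTa)
hidden = np.ones((4, 768))
hidden[3] = 100.0                # a padding position that must be ignored
mask = np.array([1, 1, 1, 0])

emb = mean_pool(hidden, mask)
print(emb.shape)  # (768,)
```

With the real model, `hidden` would come from `model(**inputs).last_hidden_state` and `mask` from the tokenizer's `attention_mask` output.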