Update README.md
README.md CHANGED
@@ -2,7 +2,17 @@
 language: ar
 tags:
 - qarib
-
+- pytorch
+- tf
+datasets:
+- arabic_billion_words
+- open_subtitles
+- twitter
+metrics:
+- f1
+widget:
+- text: " شو عندكم يا [MASK] ."
+---
 license: apache-2.0
 datasets:
 - Arabic GigaWord
@@ -26,11 +36,11 @@ For Tweets, the data was collected using twitter API and using language filter.
 ## Training QARiB
 The training of the model has been performed using Google’s original Tensorflow code on Google Cloud TPU v2.
 We used a Google Cloud Storage bucket, for persistent storage of training data and models.
-See more details in [Training QARiB](
+See more details in [Training QARiB](https://github.com/qcri/QARIB/Training_QARiB.md)
 
 ## Using QARiB
 
-You can use the raw model for either masked language modeling or next sentence prediction, but it's mostly intended to be fine-tuned on a downstream task. See the model hub to look for fine-tuned versions on a task that interests you. For more details, see [Using QARiB](
+You can use the raw model for either masked language modeling or next sentence prediction, but it's mostly intended to be fine-tuned on a downstream task. See the model hub to look for fine-tuned versions on a task that interests you. For more details, see [Using QARiB](https://github.com/qcri/QARIB/Using_QARiB.md)
 
 ### How to use
 You can use this model directly with a pipeline for masked language modeling:
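The hunk above ends before the README's own usage snippet, so the following is only a minimal sketch of the fill-mask pipeline it describes. It assumes the hub repo id `qarib/bert-base-qarib60_1790k` (inferred from the download link later in the diff) and reuses the widget text added in the front matter:

```python
# Minimal sketch, not the README's own snippet: assumes the hub repo id
# "qarib/bert-base-qarib60_1790k", inferred from the download link in this diff.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="qarib/bert-base-qarib60_1790k")

# Widget example added in the YAML front matter above.
for prediction in fill_mask("شو عندكم يا [MASK] ."):
    print(prediction["token_str"], round(prediction["score"], 4))
```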
@@ -85,12 +95,21 @@ We evaluated QARiB models on five NLP downstream task:
 
 The results obtained from QARiB models outperforms multilingual BERT/AraBERT/ArabicBERT.
 
-
 ## Model Weights and Vocab Download
-
+
+From Huggingface site: https://huggingface.co/qarib/qarib/bert-base-qarib60_1790k
 
 ## Contacts
 
 Ahmed Abdelali, Sabit Hassan, Hamdy Mubarak, Kareem Darwish and Younes Samih
 
+## Reference
+```
+@article{abdelali2020qarib,
+title={QARiB: QCRI Arabic and Dialectal BERT},
+author={Ahmed, Abdelali and Sabit, Hassan and Hamdy, Mubarak and Kareem, Darwish and Younes, Samih},
+link={https://github.com/qcri/QARIB},
+year={2020}
+}
+```
 
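For the "Model Weights and Vocab Download" section added above, a hedged sketch of pulling the weights and vocabulary through `transformers` rather than downloading files by hand; the repo id `qarib/bert-base-qarib60_1790k` is an assumption, since the URL in the diff carries an extra `qarib/` path segment:

```python
# Sketch under the assumption that the hub repo id is "qarib/bert-base-qarib60_1790k";
# the URL in the README ("huggingface.co/qarib/qarib/bert-base-qarib60_1790k") is ambiguous.
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("qarib/bert-base-qarib60_1790k")   # vocab
model = AutoModelForMaskedLM.from_pretrained("qarib/bert-base-qarib60_1790k")  # weights

inputs = tokenizer("شو عندكم يا [MASK] .", return_tensors="pt")
logits = model(**inputs).logits
print(logits.shape)  # (batch size, sequence length, vocab size)
```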