Ahmed Abdelali committed
Commit 78f607a
Parent(s): 32bf566
Update config/readme

Files changed:
- README.md +22 -9
- config.json +3 -0
README.md CHANGED

@@ -1,15 +1,18 @@
 ---
 language: ar
 tags:
+- pytorch
 - tf
 - qarib
-
-license: apache-2.0
+- qarib60_1790k
 datasets:
--
--
--
-
+- arabic_billion_words
+- open_subtitles
+- twitter
+metrics:
+- f1
+widget:
+- text: " شو عندكم يا [MASK] ."
 ---
 
 # QARiB: QCRI Arabic and Dialectal BERT
@@ -27,11 +30,11 @@ For Tweets, the data was collected using twitter API and using language filter.
 ## Training QARiB
 The training of the model has been performed using Google's original TensorFlow code on Google Cloud TPU v2.
 We used a Google Cloud Storage bucket for persistent storage of training data and models.
-See more details in [Training QARiB](
+See more details in [Training QARiB](https://github.com/qcri/QARIB/Training_QARiB.md)
 
 ## Using QARiB
 
-You can use the raw model for either masked language modeling or next sentence prediction, but it's mostly intended to be fine-tuned on a downstream task. See the model hub to look for fine-tuned versions on a task that interests you. For more details, see [Using QARiB](
+You can use the raw model for either masked language modeling or next sentence prediction, but it's mostly intended to be fine-tuned on a downstream task. See the model hub to look for fine-tuned versions on a task that interests you. For more details, see [Using QARiB](https://github.com/qcri/QARIB/Using_QARiB.md)
 
 ### How to use
 You can use this model directly with a pipeline for masked language modeling:
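The code sample that follows "How to use" in the README sits outside this hunk. As a stand-in, here is a minimal fill-mask sketch built only from what the commit itself adds: the repo id is assumed from the qarib60_1790k tag, and the prompt is the widget sentence from the new front matter.

```python
from transformers import pipeline

# Repo id assumed from the qarib60_1790k tag added in this commit.
fill_mask = pipeline("fill-mask", model="qarib/bert-base-qarib60_1790k")

# Widget sentence from the new YAML front matter:
# roughly "what do you have, [MASK]?" (dialectal Arabic)
for prediction in fill_mask("شو عندكم يا [MASK] ."):
    print(prediction["token_str"], prediction["score"])
```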
@@ -88,10 +91,20 @@ The results obtained from QARiB models outperforms multilingual BERT/AraBERT/Ar
 
 
 ## Model Weights and Vocab Download
-
+
+From Huggingface site: https://huggingface.co/qarib/bert-base-qarib60_1790k
 
 ## Contacts
 
 Ahmed Abdelali, Sabit Hassan, Hamdy Mubarak, Kareem Darwish and Younes Samih
 
+## Reference
+```
+@article{abdelali2020qarib,
+ title={QARiB: QCRI Arabic and Dialectal BERT},
+ author={Abdelali, Ahmed and Hassan, Sabit and Mubarak, Hamdy and Darwish, Kareem and Samih, Younes},
+ link={https://github.com/qcri/QARIB},
+ year={2020}
+}
+```
 
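Alongside the Hub link added for weights and vocab above, a short sketch of fetching the individual artifacts programmatically; the repo id matches that URL, and the filenames are the usual BERT artifacts, which is an assumption about this repo's layout.

```python
from huggingface_hub import hf_hub_download

# Filenames assume the standard BERT layout (vocab.txt, config.json,
# plus the weight files); adjust to what the repo actually contains.
repo_id = "qarib/bert-base-qarib60_1790k"
vocab_path = hf_hub_download(repo_id=repo_id, filename="vocab.txt")
config_path = hf_hub_download(repo_id=repo_id, filename="config.json")
print(vocab_path, config_path)
```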
config.json CHANGED

@@ -1,4 +1,7 @@
 {
+  "architectures": [
+    "BertForMaskedLM"
+  ],
   "model_type": "bert",
   "attention_probs_dropout_prob": 0.1,
   "directionality": "bidi",
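The added `architectures` entry records that the checkpoint was exported with a masked-LM head; the Hub and the transformers Auto classes consult this kind of config metadata when resolving a checkpoint. A small sketch of inspecting the new field, with the repo id assumed from the README:

```python
from transformers import AutoConfig, AutoModelForMaskedLM

# Repo id assumed from the README front matter.
repo_id = "qarib/bert-base-qarib60_1790k"

config = AutoConfig.from_pretrained(repo_id)
print(config.architectures)  # expected: ['BertForMaskedLM']

# With the masked-LM head declared in config.json, loading through the
# matching Auto class is unambiguous.
model = AutoModelForMaskedLM.from_pretrained(repo_id)
```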