Migrate model card from transformers-repo

Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/bashar-talafha/multi-dialect-bert-base-arabic/README.md

README.md +69 -0

	@@ -0,0 +1,69 @@

+---
+language: ar
+thumbnail: https://raw.githubusercontent.com/mawdoo3/Multi-dialect-Arabic-BERT/master/multidialct_arabic_bert.png
+datasets:
+- nadi
+---
+# Multi-dialect-Arabic-BERT
+This is the repository for the Multi-dialect Arabic BERT model.
+By [Mawdoo3-AI](https://ai.mawdoo3.com/).
+<p align="center">
+    <br>
+    <img src="https://raw.githubusercontent.com/mawdoo3/Multi-dialect-Arabic-BERT/master/multidialct_arabic_bert.png" alt="Background reference: http://www.qfi.org/wp-content/uploads/2018/02/Qfi_Infographic_Mother-Language_Final.pdf" width="500"/>
+    <br>
+</p>
+### About our Multi-dialect-Arabic-BERT model
+Instead of training the Multi-dialect Arabic BERT model from scratch, we initialized the weights of the model using [Arabic-BERT](https://github.com/alisafaya/Arabic-BERT) and trained it on 10M Arabic tweets from the unlabeled data of [The Nuanced Arabic Dialect Identification (NADI) shared task](https://sites.google.com/view/nadi-shared-task).
+### To cite this work
+```
+@misc{talafha2020multidialect,
+    title={Multi-Dialect Arabic BERT for Country-Level Dialect Identification},
+    author={Bashar Talafha and Mohammad Ali and Muhy Eddin Za'ter and Haitham Seelawi and Ibraheem Tuffaha and Mostafa Samir and Wael Farhan and Hussein T. Al-Natsheh},
+    year={2020},
+    eprint={2007.05612},
+    archivePrefix={arXiv},
+    primaryClass={cs.CL}
+}
+```
+### Usage
+The model weights can be loaded using the `transformers` library by Hugging Face.
+```python
+from transformers import AutoTokenizer, AutoModel
+tokenizer = AutoTokenizer.from_pretrained("bashar-talafha/multi-dialect-bert-base-arabic")
+model = AutoModel.from_pretrained("bashar-talafha/multi-dialect-bert-base-arabic")
+```
+Example using `pipeline`:
+```python
+from transformers import pipeline
+fill_mask = pipeline(
+    "fill-mask",
+    model="bashar-talafha/multi-dialect-bert-base-arabic",
+    tokenizer="bashar-talafha/multi-dialect-bert-base-arabic"
+)
+fill_mask(" سافر الرحالة من مطار [MASK] ")
+```
+```
+[{'sequence': '[CLS] سافر الرحالة من مطار الكويت [SEP]', 'score': 0.08296813815832138, 'token': 3226},
+ {'sequence': '[CLS] سافر الرحالة من مطار دبي [SEP]', 'score': 0.05123933032155037, 'token': 4747},
+ {'sequence': '[CLS] سافر الرحالة من مطار مسقط [SEP]', 'score': 0.046838656067848206, 'token': 13205},
+ {'sequence': '[CLS] سافر الرحالة من مطار القاهرة [SEP]', 'score': 0.03234650194644928, 'token': 4003},
+ {'sequence': '[CLS] سافر الرحالة من مطار الرياض [SEP]', 'score': 0.02606341242790222, 'token': 2200}]
+```
+### Repository
+Please check the [original repository](https://github.com/mawdoo3/Multi-dialect-Arabic-BERT) for more information.
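
The `fill-mask` output shown in the Usage section is a list of dictionaries with `sequence`, `score`, and `token` keys. As a minimal sketch of how such a result might be post-processed, the snippet below picks the highest-scoring entry and strips the `[CLS]` and `[SEP]` markers. The helper names `clean_sequence` and `top_prediction` are hypothetical and not part of `transformers`; the sample data is copied from the output above, and other versions of the library may format the output differently.

```python
from typing import Dict, List

# Two entries copied verbatim from the fill-mask output in the Usage section.
predictions = [
    {"sequence": "[CLS] سافر الرحالة من مطار الكويت [SEP]", "score": 0.08296813815832138, "token": 3226},
    {"sequence": "[CLS] سافر الرحالة من مطار دبي [SEP]", "score": 0.05123933032155037, "token": 4747},
]


def clean_sequence(sequence: str) -> str:
    """Remove the BERT special tokens [CLS] and [SEP] (hypothetical helper)."""
    return sequence.replace("[CLS]", "").replace("[SEP]", "").strip()


def top_prediction(results: List[Dict]) -> Dict:
    """Return the entry with the highest score (hypothetical helper)."""
    return max(results, key=lambda r: r["score"])


best = top_prediction(predictions)
print(clean_sequence(best["sequence"]), round(best["score"], 4))
```

In practice, `predictions` would be the return value of the `fill_mask(...)` call rather than a hard-coded list.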