moha committed on
Commit
e0ca590
1 Parent(s): 447b55e

Update README.md

Files changed (1)
  1. README.md +17 -1
README.md CHANGED
@@ -4,7 +4,7 @@ widget:
  - text: "للوقايه من انتشار [MASK]"
  ---
  # arabert_c19: An Arabert model pretrained on 1.5 million COVID-19 multi-dialect Arabic tweets
- **mBERT COVID-19** is a pretrained (fine-tuned) version of the mBERT model (https://huggingface.co/bert-base-multilingual-cased). The pretraining was done using 1.5 million multi-dialect Arabic tweets regarding the COVID-19 pandemic from the “Large Arabic Twitter Dataset on COVID-19” (https://arxiv.org/abs/2004.04315).
  The model can achieve better results for the tasks that deal with multi-dialect Arabic tweets in relation to the COVID-19 pandemic.
 
  # Classification results for multiple tasks including fake-news and hate speech detection when using arabert_c19 and mbert_ar_c19:
@@ -25,5 +25,21 @@ arabert_prep = ArabertPreprocessor(model_name=model_name)
  text = "للوقايه من عدم انتشار كورونا عليك اولا غسل اليدين بالماء والصابون وتكون عملية الغسل دقيقه تشمل راحة اليد الأصابع التركيز على الإبهام"
  arabert_prep.preprocess(text)
  ```
  # Contacts
  **Hadj Ameur**: [Github](https://github.com/MohamedHadjAmeur) | <mohamedhadjameur@gmail.com> | <mhadjameur@cerist.dz>
  - text: "للوقايه من انتشار [MASK]"
  ---
  # arabert_c19: An Arabert model pretrained on 1.5 million COVID-19 multi-dialect Arabic tweets
+ **mBERT COVID-19** [Arxiv URL](https://arxiv.org/pdf/2105.03143.pdf) is a pretrained (fine-tuned) version of the mBERT model (https://huggingface.co/bert-base-multilingual-cased). The pretraining was done using 1.5 million multi-dialect Arabic tweets regarding the COVID-19 pandemic from the “Large Arabic Twitter Dataset on COVID-19” (https://arxiv.org/abs/2004.04315).
  The model can achieve better results for the tasks that deal with multi-dialect Arabic tweets in relation to the COVID-19 pandemic.
 
  # Classification results for multiple tasks including fake-news and hate speech detection when using arabert_c19 and mbert_ar_c19:
  text = "للوقايه من عدم انتشار كورونا عليك اولا غسل اليدين بالماء والصابون وتكون عملية الغسل دقيقه تشمل راحة اليد الأصابع التركيز على الإبهام"
  arabert_prep.preprocess(text)
  ```
+
+ # Citation
+
+ Please cite as:
+
+ ``` bibtex
+ @misc{ameur2021aracovid19mfh,
+ title={AraCOVID19-MFH: Arabic COVID-19 Multi-label Fake News and Hate Speech Detection Dataset},
+ author={Mohamed Seghir Hadj Ameur and Hassina Aliane},
+ year={2021},
+ eprint={2105.03143},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL}
+ }
+ ```
+
  # Contacts
  **Hadj Ameur**: [Github](https://github.com/MohamedHadjAmeur) | <mohamedhadjameur@gmail.com> | <mhadjameur@cerist.dz>
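
Reviewer note: the `widget` entry in the README front matter uses a `[MASK]` prompt, which suggests the model is meant to be tried as a fill-mask model. A minimal sketch of that usage with the `transformers` fill-mask pipeline might look like the following. The repo id `moha/arabert_c19` is an assumption inferred from the committer and model names, not something the README states, and `top_predictions` and `fill_mask` are hypothetical helper names introduced only for this illustration.

```python
# Sketch only: MODEL_ID is an assumed Hub repo id, not confirmed by the README.
MODEL_ID = "moha/arabert_c19"


def top_predictions(results, k=3):
    """Return the k highest-scoring (token, score) pairs from fill-mask output.

    `results` is expected to be a list of dicts carrying "token_str" and
    "score" keys, which is the shape the fill-mask pipeline returns for a
    single input string containing one mask token.
    """
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return [(r["token_str"], round(r["score"], 4)) for r in ranked[:k]]


def fill_mask(text, model_id=MODEL_ID, k=3):
    """Run the fill-mask pipeline on `text` and return its top-k candidates."""
    # Imported lazily so top_predictions can be used without transformers.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model=model_id)
    return top_predictions(unmasker(text, top_k=k), k)
```

Calling `fill_mask("للوقايه من انتشار [MASK]")` would download the model and return candidate tokens for the masked position. The README's own snippet runs text through `ArabertPreprocessor` first, so applying the same preprocessing before calling `fill_mask` is probably advisable.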