The AraRoBERTa models are mono-dialectal Arabic language models, each trained on a single country-level dialect. AraRoBERTa uses the RoBERTa base configuration. More details are available in the paper cited below.
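As a rough illustration of what the RoBERTa base configuration implies for model size, the sketch below counts encoder parameters from the widely published RoBERTa base hyperparameters. The class and function names are hypothetical, the values are not read from an AraRoBERTa checkpoint, and the AraRoBERTa vocabulary size is not stated here, so the example falls back on the original English RoBERTa vocabulary (50265) purely as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RobertaBaseShape:
    """Published RoBERTa base hyperparameters (assumed, not read from AraRoBERTa)."""
    num_layers: int = 12
    hidden_size: int = 768
    num_heads: int = 12  # listed for completeness; does not change the count
    intermediate_size: int = 3072
    max_positions: int = 514  # 512 tokens plus 2 padding-offset slots
    type_vocab_size: int = 1


def encoder_parameter_count(vocab_size: int,
                            shape: RobertaBaseShape = RobertaBaseShape()) -> int:
    """Count encoder weights (with pooler, without the language-model head)."""
    h, ff = shape.hidden_size, shape.intermediate_size
    layer_norm = 2 * h  # scale and bias
    # Token, position, and token-type embeddings, followed by one LayerNorm.
    embeddings = (vocab_size + shape.max_positions + shape.type_vocab_size) * h + layer_norm
    # Query, key, value, and output projections (weights and biases), then LayerNorm.
    attention = 4 * (h * h + h) + layer_norm
    # Two dense layers of the feed-forward block, then LayerNorm.
    feed_forward = (h * ff + ff) + (ff * h + h) + layer_norm
    pooler = h * h + h
    return embeddings + shape.num_layers * (attention + feed_forward) + pooler


if __name__ == "__main__":
    # Assumed vocabulary size: the original English RoBERTa value, not AraRoBERTa's.
    print(encoder_parameter_count(50265))  # 124645632
```

Only the embedding table depends on the vocabulary, so a dialect-specific tokenizer with a different vocabulary size shifts the total by 768 parameters per token.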

AraRoBERTa is available in seven dialectal variations.

When using the model, please cite our paper:

@inproceedings{alyami-al-zaidy-2022-weakly,
    title = "Weakly and Semi-Supervised Learning for {A}rabic Text Classification using Monodialectal Language Models",
    author = "AlYami, Reem  and Al-Zaidy, Rabah",
    booktitle = "Proceedings of the Seventh Arabic Natural Language Processing Workshop (WANLP)",
    month = dec,
    year = "2022",
    address = "Abu Dhabi, United Arab Emirates (Hybrid)",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.wanlp-1.24",
    pages = "260--272",
}

Contact

Reem AlYami: LinkedIn | reem.yami@kfupm.edu.sa | yami.m.reem@gmail.com
