Access note: this repository is publicly accessible, but you must log in to Hugging Face, agree to share your contact information, and accept the conditions before you can access its files and content.

Fine-tuned CAMeL-BERT Model for Sentiment Analysis in Moroccan Darija

Model Name: CAMeL-BERT Fine-Tuned for Moroccan Darija Sentiment Analysis
Model ID: NerdyPy/fine_tuned_model_sentiment_analysis
Language: Arabic (Modern Standard Arabic and Moroccan Darija)
Task: Sentiment Analysis (Negative, Neutral, Positive)


Model Description

This model is a fine-tuned version of the CAMeL-Lab BERT model, adapted for sentiment analysis in Moroccan Darija, a highly under-resourced Arabic dialect. It classifies Arabic text, in both Modern Standard Arabic (MSA) and Moroccan Darija, into three sentiment categories:

  • Negative
  • Neutral
  • Positive

By focusing on Moroccan Darija, this model addresses the scarcity of NLP resources for this dialect and aims to improve sentiment analysis in the mixed-language contexts common in Moroccan user-generated content.


Intended Use

Primary Use Case

  • Sentiment analysis of user-generated content, such as comments and reviews, in Moroccan Darija and MSA.

Applications

  • Analyzing public opinion on social media platforms and online news sites.
  • Assisting researchers in understanding societal attitudes and trends.
  • Supporting policymakers and organizations in gauging public sentiment.

Users

  • Researchers and data scientists in NLP.
  • Organizations analyzing Arabic-language social media.
  • Developers building sentiment analysis tools for Arabic dialects.

Limitations and Risks

Dialectal Variations

  • Performance may degrade on Arabic dialects that are not represented in the training data.

Data Bias

  • The model may reflect biases present in the training datasets.

Language Mixing (Code-Switching)

  • The model may struggle with text that heavily mixes Moroccan Darija with other languages (e.g., French, English, Spanish), which can reduce the accuracy of sentiment classification. For example, in "واش كتفهم le français؟" the speaker switches from Moroccan Darija to French within the same sentence. Because the model was trained primarily on Arabic text, it may misjudge the sentiment when part of the input is non-Arabic.
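One way to act on this limitation is to flag inputs that mix scripts before trusting the predicted label. The sketch below is a rough heuristic and not part of the model: the function names `script_mix` and `is_code_switched` and the 0.2 threshold are illustrative assumptions. It only detects mixing of Arabic and Latin script, so it will not catch language switches written in a single script.

```python
import unicodedata


def script_mix(text: str) -> dict:
    """Return the share of Arabic-script and Latin-script letters in `text`."""
    arabic = latin = 0
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation, spaces, and diacritics
        name = unicodedata.name(ch, "")
        if name.startswith("ARABIC"):
            arabic += 1
        elif name.startswith("LATIN"):
            latin += 1
    total = arabic + latin
    if total == 0:
        return {"arabic": 0.0, "latin": 0.0}
    return {"arabic": arabic / total, "latin": latin / total}


def is_code_switched(text: str, threshold: float = 0.2) -> bool:
    """Flag text where both scripts make up at least `threshold` of the letters.

    The threshold is an arbitrary starting point (an assumption), not a tuned value.
    """
    mix = script_mix(text)
    return mix["arabic"] >= threshold and mix["latin"] >= threshold


# The example sentence from this section mixes Darija and French.
print(is_code_switched("واش كتفهم le français؟"))
```

Flagged inputs could be routed to manual review or reported separately, so that lower-confidence predictions on code-switched text do not skew aggregate results.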

Generalization

  • Limited performance on topics or vocabulary outside the training data.

How to Use

You can use this model with the Hugging Face Transformers library:

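import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("NerdyPy/fine_tuned_model_sentiment_analysis")
model = AutoModelForSequenceClassification.from_pretrained("NerdyPy/fine_tuned_model_sentiment_analysis")

# Example text in Arabic
text = "العمل في هذا المكان كان رائعاً، ولكن شي مرات ما كاينش التنظيم"

# Tokenize, run inference, and print the label name stored in the model config
# (this may be a generic name such as LABEL_0 if the config does not define labels)
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(dim=-1).item()])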
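A sequence-classification model returns one raw score (logit) per class, and a softmax turns those scores into probabilities. The sketch below shows that step in plain Python. The label order in `LABELS` and the example logits are assumptions made for illustration; the actual index-to-label mapping should be read from `model.config.id2label`.

```python
import math

# Assumed label order, for illustration only; verify against model.config.id2label.
LABELS = ["Negative", "Neutral", "Positive"]


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, labels=LABELS):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Made-up logits standing in for the output of model(**inputs).logits[0].tolist()
label, prob = top_label([0.1, 0.2, 2.5])
print(label, round(prob, 3))
```

Reporting the probability alongside the label makes it easier to spot low-confidence predictions, for instance on inputs outside the training distribution.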

Model size: 109M params (Safetensors, tensor type F32)