Model Card for Kushtrim/bert-base-multilingual-cased-sq-sentiment-sst2

Model Description

This model is a sentiment classifier for Albanian text. It is built on BERT, an architecture that represents each word in the context of the sentence around it, and is fine-tuned from the multilingual BERT base (cased) checkpoint to improve performance on Albanian. It classifies a text as either positive or negative.

Intended Use

Primary Use: Sentiment analysis for Albanian text.

Target Audience: Data scientists, NLP practitioners, researchers, and businesses interested in understanding sentiment in Albanian-language texts.

Application Examples: Analyzing customer feedback, social media monitoring, market research.

Training Data

The model was trained on the SST2 (Stanford Sentiment Treebank 2) dataset, machine-translated into Albanian. SST2 is originally in English and consists of sentences from movie reviews, each annotated as positive or negative. The sentences cover both colloquial and formal language. The machine translation aimed to preserve the sentiment and linguistic nuances of the original sentences, but translation inaccuracies may affect how the model classifies sentiment in some cases.

Limitations

The model might not perform well on dialects or slang. Its grasp of context might be limited in complex sentences. Performance might degrade on texts from domains not represented in the training set, which consists of machine-translated movie-review sentences.
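
One way to work around these limitations is to route low-confidence predictions to manual review. The sketch below assumes results in the list-of-dicts form returned by the transformers pipeline (a "label" and a "score" per text). The function name split_by_confidence and the 0.75 threshold are illustrative assumptions, not part of the model, and the score is a model probability that may not be well calibrated.

```python
from typing import Dict, List, Tuple


def split_by_confidence(
    texts: List[str],
    results: List[Dict],
    threshold: float = 0.75,  # hypothetical cutoff; tune on your own data
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Separate confident predictions from texts that need manual review."""
    confident: List[Tuple[str, str]] = []
    needs_review: List[str] = []
    for text, result in zip(texts, results):
        if result["score"] >= threshold:
            # Keep the text together with its predicted label.
            confident.append((text, result["label"]))
        else:
            # Below the cutoff: do not trust the label, queue for a human.
            needs_review.append(text)
    return confident, needs_review
```

Texts containing dialect, slang, or out-of-domain vocabulary may land in the review list more often, which would be one signal of the limitations described above.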

Ethical Considerations

Care should be taken not to use the model in ways that amplify biases present in the training data. The model should not be used for manipulative or harmful purposes, such as influencing political elections.

Usage Instructions

The model can be loaded through the pipeline API of the Hugging Face transformers library:

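from transformers import pipeline

# Load the fine-tuned Albanian sentiment classifier from the Hugging Face Hub.
classifier = pipeline("sentiment-analysis", model="Kushtrim/bert-base-multilingual-cased-sq-sentiment-sst2")

# Classify a single sentence ("The movie was beautiful").
output = classifier("Filmi ishte i bukur")
print(output)  # a list with one dict holding the predicted label and its score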
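
For use cases such as analyzing customer feedback, it is often the overall distribution of labels that matters rather than individual predictions. The following is a minimal sketch, assuming the classifier is called with a list of texts and returns one dict with a "label" key per text, as the transformers pipeline does. The helper name summarize_sentiment is hypothetical, and the label strings are whatever the model's configuration defines.

```python
from collections import Counter
from typing import Callable, Dict, List


def summarize_sentiment(
    texts: List[str],
    classifier: Callable[[List[str]], List[Dict]],
) -> Dict[str, float]:
    """Return the share of each predicted label across `texts`."""
    if not texts:
        return {}
    # One {"label": ..., "score": ...} dict is expected per input text.
    results = classifier(texts)
    counts = Counter(result["label"] for result in results)
    # Convert raw counts into fractions of the whole batch.
    return {label: n / len(texts) for label, n in counts.items()}
```

With the classifier created in the snippet above, summarize_sentiment(reviews, classifier) would return a dict mapping each label to its fraction of the reviews list.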

Model size: 178M params (Safetensors, tensor type F32)
