---
metrics:
- f1
- accuracy
library_name: transformers
pipeline_tag: text-classification
---

# Model Card for Model ID

This repository contains the implementation and evaluation of several classical machine learning and deep learning models for sentiment analysis in Albanian, a low-resource language with unique characteristics and rich semantic nuances.

The models used in this research include:

**Classical Machine Learning Models**

- Logistic Regression
- Naive Bayes
- Random Forest

**Deep Learning Models**

- Transformers: XLM-RoBERTa, BERT, BigBirdPegasus
- Recurrent Neural Networks (RNN): LSTM
- Convolutional Neural Networks (CNN): Word2Vec with skip-gram embedding, Keras embedding

## Model Details

### Model Description

- **Developed by:** Florian Hiso
- **Model type:** Logistic Regression, Naive Bayes, Random Forest, Transformers, RNN, CNN
- **Language(s) (NLP):** Albanian
- **Finetuned from model:** XLM-RoBERTa, BERT, BigBirdPegasus

## Uses

This research compares the performance of different models on sentiment analysis for the Albanian language. The findings indicate a promising level of accuracy from both classical machine learning and deep learning models. These results contribute to the understanding of natural language processing (NLP) techniques and their application to Albanian sentiment analysis.

### Out-of-Scope Use

This repository and the models implemented in it are designed specifically for sentiment analysis in the Albanian language. The intended applications are academic research, NLP experimentation, and advancing the understanding of NLP techniques for low-resource languages.

**Prohibited Uses**

The following uses are explicitly out of scope and prohibited:

- **Commercial Applications:** Any use of the models for commercial purposes without proper authorization.
- **Medical or Legal Advice:** The models are not intended to provide medical, legal, or any other form of professional advice.
- **Sensitive Content Analysis:** The models should not be used to analyze or process content that involves sensitive personal data, hate speech, or explicit content.
- **Real-Time Decision Making:** The models should not be used for real-time decision-making, especially in critical or high-stakes environments.
- **Misuse of Data:** The models must not be used to infer or generate information that could misrepresent data or harm individuals or groups.

When using these models, it is crucial to consider the ethical implications, especially given the sensitive nature of language processing and sentiment analysis. Users should adhere to the following guidelines:

- **Transparency:** Be clear about the capabilities and limitations of the models.
- **Privacy:** Ensure that any data used with the models is anonymized and handled in compliance with data protection regulations.
- **Bias and Fairness:** Be aware of, and mitigate, any biases that may arise from the models or the data they are trained on.

By adhering to these guidelines and respecting the scope of use, we can ensure that the research and its outcomes are used responsibly and ethically.
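
The metadata of this card lists accuracy and F1 as its evaluation metrics, and the guidelines above ask users to watch for bias. As a minimal sketch of how the two fit together, the following pure-Python example computes accuracy, per-class F1, and macro-averaged F1 for a sentiment classifier. Looking at F1 per class is one simple way to notice that a model favours one sentiment label over another. The label names and the `gold`/`pred` lists are hypothetical and used for illustration only; they are not results from this research, and the evaluation code in this repository may differ.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty label list")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in the gold or predicted labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty label list")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[t] += 1  # gold label t was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical labels, for illustration only (not results from this research).
gold = ["positive", "negative", "negative", "positive", "negative", "positive"]
pred = ["positive", "negative", "positive", "positive", "negative", "negative"]

print(f"accuracy: {accuracy(gold, pred):.3f}")
print(f"macro F1: {macro_f1(gold, pred):.3f}")
print(per_class_f1(gold, pred))
```

Macro averaging weights every class equally, so a weak minority class lowers the score even when overall accuracy looks high. For real experiments, an established implementation such as scikit-learn's `accuracy_score` and `f1_score` is usually preferable to hand-written metrics.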