
eevvgg/sentimenTw-political

This model is a fine-tuned version of the multilingual model cardiffnlp/twitter-xlm-roberta-base-sentiment. It classifies text sentiment into three categories: negative, neutral, and positive. It was fine-tuned on a 2k sample of manually annotated Reddit (EN) and Twitter (PL) data.

Uses

Sentiment classification in multilingual data. The model was fine-tuned on a 2k sample of English and Polish social media texts from the political domain and is suited to short texts (up to 200 tokens).

How to Get Started with the Model

from transformers import pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
model_path = "eevvgg/sentimenTw-political"
sentiment_task = pipeline(task="text-classification", model=model_path, tokenizer=model_path)

# One English and one Polish example.
sequence = ["TRUMP needs undecided voters",
            "Oczywiście ze Pan Prezydent to nasza duma narodowa!!"]

# The pipeline returns one dict per input text, with "label" and "score" keys.
result = sentiment_task(sequence)
labels = [i['label'] for i in result]  # ['neutral', 'positive']
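
Since the pipeline returns one dictionary per input text, the predictions for a larger batch can be summarized by tallying the labels. The following is a minimal sketch only: the `count_labels` helper is hypothetical, and `example_result` is a hand-written stand-in for the output of `sentiment_task(sequence)` with made-up scores, not real model output.

```python
from collections import Counter


def count_labels(result):
    """Count how often each sentiment label occurs in pipeline-style output."""
    return Counter(item["label"] for item in result)


# Hand-written stand-in for `sentiment_task(sequence)`; the scores are made up.
example_result = [
    {"label": "neutral", "score": 0.71},
    {"label": "positive", "score": 0.93},
]

print(count_labels(example_result))
```

In practice, `example_result` would be replaced by the `result` list from the snippet above.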

Model Sources

@misc{SentimenTwGK2023,
  author={Gajewska, Ewelina and Konat, Barbara},
  title={SentimenTw XLM-RoBERTa-base Model for Multilingual Sentiment Classification on Social Media},
  year={2023},
  howpublished = {\url{https://huggingface.co/eevvgg/sentimenTw-political}},
}

Training Details

  • Trained for 3 epochs with a mini-batch size of 8.
  • Training loss: 0.515
  • See details in Colab notebook

Preprocessing

  • Hyperlinks and user mentions (@) are normalized to the "http" and "@user" tokens, respectively. Extra spaces are removed.
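
The normalization described above can be approximated as follows, so that inference inputs resemble the training data. This is a sketch under assumptions, not the authors' preprocessing code: the `normalize_text` name is hypothetical, and treating any whitespace-delimited token that starts with "http" or "@" as a link or mention is a simplification (punctuation attached to such a token is dropped with it).

```python
def normalize_text(text: str) -> str:
    """Map links to "http", user mentions to "@user", and collapse extra spaces."""
    tokens = []
    # str.split() with no argument splits on any whitespace run,
    # so rejoining with single spaces also removes extra spaces.
    for token in text.split():
        if token.startswith("http"):
            token = "http"
        elif token.startswith("@") and len(token) > 1:
            token = "@user"
        tokens.append(token)
    return " ".join(tokens)


print(normalize_text("@JohnDoe  check   https://example.com now "))
# prints: @user check http now
```

Texts would be passed through such a function before being handed to the classification pipeline.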

Speeds, Sizes, Times

Evaluation

  • Evaluation was run on a sample of 200 texts (10% of the data).

Results

  • accuracy: 74.0
  • macro avg:
    • f1: 71.2
    • precision: 72.8
    • recall: 70.8
  • weighted avg:
    • f1: 73.3
    • precision: 74.0
    • recall: 74.0

                     precision    recall  f1-score   support
      
         negative      0.752     0.901     0.820        91
         neutral       0.764     0.592     0.667        71
         positive      0.667     0.632     0.649        38
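
The macro and weighted averages listed under Results follow from the per-class rows of this report: the macro average is the unweighted mean over the three classes, and the weighted average weights each class by its support. A minimal sketch of that arithmetic, with the values copied from the report and the `averages` helper being a hypothetical name:

```python
# Per-class (precision, recall, f1, support), copied from the report above.
report = {
    "negative": (0.752, 0.901, 0.820, 91),
    "neutral": (0.764, 0.592, 0.667, 71),
    "positive": (0.667, 0.632, 0.649, 38),
}


def averages(rows):
    """Return (macro, weighted) averages of precision, recall and f1."""
    values = list(rows.values())
    total_support = sum(row[3] for row in values)
    # Macro: plain mean over classes. Weighted: mean weighted by class support.
    macro = [sum(row[i] for row in values) / len(values) for i in range(3)]
    weighted = [sum(row[i] * row[3] for row in values) / total_support for i in range(3)]
    return macro, weighted


macro, weighted = averages(report)
print("macro    (P, R, F1) in %:", [round(100 * v, 1) for v in macro])
print("weighted (P, R, F1) in %:", [round(100 * v, 1) for v in weighted])
```

The support-weighted recall equals the overall accuracy, which is why both appear as 74.0 in the results list.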
      

Citation

APA:

Gajewska, E., & Konat, B. (2023).
SentimenTw XLM-RoBERTa-base Model for Multilingual Sentiment Classification on Social Media.
https://huggingface.co/eevvgg/sentimenTw-political.