eevvgg/sentimenTw-political

This model is a fine-tuned version of the multilingual model cardiffnlp/twitter-xlm-roberta-base-sentiment. It classifies text sentiment into three categories: negative, neutral, and positive. It was fine-tuned on a 2k sample of manually annotated Reddit (EN) and Twitter (PL) data.

Uses

Sentiment classification of multilingual data. The model was fine-tuned on a 2k sample of English and Polish social media texts from the political domain and is suited to short texts (up to 200 tokens).

How to Get Started with the Model

from transformers import pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
model_path = "eevvgg/sentimenTw-political"
sentiment_task = pipeline(task="text-classification", model=model_path, tokenizer=model_path)

# One English and one Polish example.
sequence = ["TRUMP needs undecided voters",
            "Oczywiście ze Pan Prezydent to nasza duma narodowa!!"]

# Each result is a dict with a 'label' and a 'score'.
result = sentiment_task(sequence)
labels = [i['label'] for i in result]  # ['neutral', 'positive']
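
Because the model is suited to short texts (up to 200 tokens), longer inputs may need to be truncated before classification. The helper below is a minimal sketch of one way to do that, not the authors' code: classify_short and max_tokens are hypothetical names, and it assumes the text-classification pipeline forwards truncation and max_length to the tokenizer at call time. Note that max_length counts special tokens as well.

```python
from typing import Callable, List


def classify_short(classifier: Callable, texts: List[str], max_tokens: int = 200) -> List[str]:
    """Classify texts, truncating each input to at most max_tokens tokens.

    `classifier` is expected to behave like a transformers
    text-classification pipeline: it takes a list of strings plus
    tokenizer keyword arguments and returns a list of dicts with a
    'label' key. The 200-token default mirrors the model card's
    "short text" guidance and is an assumption, not a hard limit.
    """
    results = classifier(texts, truncation=True, max_length=max_tokens)
    return [r["label"] for r in results]


# Usage with the real model (downloads weights from the Hub):
#   from transformers import pipeline
#   clf = pipeline("text-classification", model="eevvgg/sentimenTw-political")
#   classify_short(clf, ["TRUMP needs undecided voters"])
```

Passing the classifier in as an argument keeps the helper independent of how the pipeline was created, so the same function works with a locally cached model or a stub in tests.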

Model Sources

@misc{SentimenTwGK2023,
  author={Gajewska, Ewelina and Konat, Barbara},
  title={SentimenTw XLM-RoBERTa-base Model for Multilingual Sentiment Classification on Social Media},
  year={2023},
  howpublished = {\url{https://huggingface.co/eevvgg/sentimenTw-political}},
}

Training Details

  • Trained for 3 epochs with a mini-batch size of 8.
  • Training loss: 0.515
  • See details in Colab notebook

Preprocessing

  • Hyperlinks and user mentions (@) are normalized to the "http" and "@user" tokens, respectively, and extra spaces are removed.
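
The normalization described above can be approximated with a few regular expressions. This is a sketch under assumptions: the exact rules used during training are not given here, so the patterns and the function name normalize are illustrative only.

```python
import re

# Assumed patterns; the card does not specify the exact matching rules.
URL_RE = re.compile(r"https?://\S+|www\.\S+")   # hyperlinks
MENTION_RE = re.compile(r"(?<!\w)@\w+")         # @mentions, but not e-mail addresses
SPACES_RE = re.compile(r"\s+")                  # runs of whitespace


def normalize(text: str) -> str:
    """Replace links with 'http', mentions with '@user', and collapse spaces."""
    text = URL_RE.sub("http", text)
    text = MENTION_RE.sub("@user", text)
    return SPACES_RE.sub(" ", text).strip()


print(normalize("@JohnDoe  check   https://example.com/page now"))
# @user check http now
```

Applying the same normalization at inference time keeps inputs consistent with what the model saw during fine-tuning.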

Evaluation

  • Evaluation was run on a sample of 200 texts (10% of the data).

Results

  • accuracy: 74.0%
  • macro avg:
    • f1: 71.2%
    • precision: 72.8%
    • recall: 70.8%
  • weighted avg:
    • f1: 73.3%
    • precision: 74.0%
    • recall: 74.0%

              precision    recall  f1-score   support

    negative      0.752     0.901     0.820        91
     neutral      0.764     0.592     0.667        71
    positive      0.667     0.632     0.649        38
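
The macro and weighted averages listed under Results follow from the per-class rows above. The sketch below recomputes them from those rows. The variable and function names are illustrative, and small differences are possible because the per-class values are already rounded to three decimals.

```python
# Per-class (precision, recall, f1, support), copied from the report above.
report = {
    "negative": (0.752, 0.901, 0.820, 91),
    "neutral":  (0.764, 0.592, 0.667, 71),
    "positive": (0.667, 0.632, 0.649, 38),
}
METRICS = ("precision", "recall", "f1")


def averages(rows):
    """Return (macro, weighted) averages of precision, recall, and f1."""
    total = sum(r[3] for r in rows.values())
    # Macro: unweighted mean over classes.
    macro = {m: sum(r[i] for r in rows.values()) / len(rows)
             for i, m in enumerate(METRICS)}
    # Weighted: mean over classes, weighted by class support.
    weighted = {m: sum(r[i] * r[3] for r in rows.values()) / total
                for i, m in enumerate(METRICS)}
    return macro, weighted


macro, weighted = averages(report)
print({m: round(100 * v, 1) for m, v in macro.items()})
# {'precision': 72.8, 'recall': 70.8, 'f1': 71.2}
print({m: round(100 * v, 1) for m, v in weighted.items()})
# {'precision': 74.0, 'recall': 74.0, 'f1': 73.3}
```

The weighted recall equals the overall accuracy (74.0%), as expected when every evaluated text belongs to exactly one class.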
      

Citation

BibTeX:

@misc{SentimenTwGK2023,
  author={Gajewska, Ewelina and Konat, Barbara},
  title={SentimenTw XLM-RoBERTa-base Model for Multilingual Sentiment Classification on Social Media},
  year={2023},
  howpublished = {\url{https://huggingface.co/eevvgg/sentimenTw-political}},
}

APA:

Gajewska, E., & Konat, B. (2023). SentimenTw XLM-RoBERTa-base Model for Multilingual Sentiment Classification on Social Media. https://huggingface.co/eevvgg/sentimenTw-political