---
tags:
- generated_from_keras_callback
model-index:
- name: XLM-T-Sent-Politics
  results: []
---

# XLM-T-Sent-Politics

This model extends the multilingual `twitter-xlm-roberta-base-sentiment` model ([model](https://huggingface.co/cardiffnlp/twitter-xlm-roberta-base-sentiment), [original paper](https://arxiv.org/abs/2104.12250)) with a focus on sentiment in politicians' tweets. The base model was fine-tuned for sentiment analysis on 8 languages (Ar, En, Fr, De, Hi, It, Sp, Pt); this model was further trained on tweets from Members of Parliament in the UK (English), Spain (Spanish), and Greece (Greek).

- Reference Paper: [Politics, Sentiment and Virality: A Large-Scale Multilingual Twitter Analysis in Greece, Spain and United Kingdom](https://arxiv.org/pdf/2202.00396.pdf).
- Git Repo: [https://github.com/cardiffnlp/politics-and-virality-twitter](https://github.com/cardiffnlp/politics-and-virality-twitter).

## Full classification example

```python
from transformers import AutoModelForSequenceClassification
from transformers import AutoTokenizer
import numpy as np
from scipy.special import softmax


def preprocess(text):
    # Assumed to follow the base model's convention: mask user mentions and links.
    new_text = []
    for t in text.split(" "):
        t = '@user' if t.startswith('@') and len(t) > 1 else t
        t = 'http' if t.startswith('http') else t
        new_text.append(t)
    return " ".join(new_text)


MODEL = "cardiffnlp/xlm-twitter-politics-sentiment"

tokenizer = AutoTokenizer.from_pretrained(MODEL)

# PT
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

text = "Good night 😊"
text = preprocess(text)
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
scores = output[0][0].detach().numpy()  # logits for the single input
scores = softmax(scores)  # convert logits to probabilities

# # TF
# from transformers import TFAutoModelForSequenceClassification
# model = TFAutoModelForSequenceClassification.from_pretrained(MODEL)
# model.save_pretrained(MODEL)

# text = "Good night 😊"
# text = preprocess(text)
# encoded_input = tokenizer(text, return_tensors='tf')
# output = model(encoded_input)
# scores = output[0][0].numpy()
# scores = softmax(scores)

# Print scores in ascending order.
# Note: i is the rank position; ranking[i] is the class index.
ranking = np.argsort(scores)
for i in range(scores.shape[0]):
    s = scores[ranking[i]]
    print(i, s)
```

Output:

```
0 0.0048229103
1 0.03117284
2 0.9640044
```
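
The classification example prints rank positions rather than class names. As a minimal sketch of how the raw logits could be turned into a labeled ranking, the helper below applies a softmax and sorts classes from most to least likely. The function name `rank_labels`, the sample logits, and the `ID2LABEL` mapping are all assumptions made for illustration; the names this checkpoint actually uses should be read from `model.config.id2label`.

```python
import math


def rank_labels(logits, id2label):
    """Softmax the raw logits and return (label, probability) pairs, best first."""
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort class indices by probability, highest first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in order]


# Hypothetical mapping and logits, for illustration only.
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}
ranked = rank_labels([-2.5, -0.6, 2.8], ID2LABEL)
for label, prob in ranked:
    print(f"{label}: {prob:.4f}")
```

With the PyTorch model from the example, the same helper could be called as `rank_labels(output[0][0].detach().tolist(), model.config.id2label)`.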