cardiffnlp/xlm-twitter-politics-sentiment

antypasd committed on Aug 2, 2022

Commit 222085a • 1 Parent(s): 3ebc4c5

Update README.md

Files changed (1)

README.md +42 -23

@@ -6,42 +6,61 @@ model-index:
   results: []
 ---
-<!-- This model card has been generated automatically according to the information Keras had access to. You should
-probably proofread and complete it, then remove this comment. -->
 # XLM-T-Sent-Politics
-This model was trained from scratch on an unknown dataset.
-It achieves the following results on the evaluation set:
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- optimizer: None
-- training_precision: float32
-### Training results
-### Framework versions
-- Transformers 4.21.0
-- TensorFlow 2.8.2
-- Datasets 2.4.0
-- Tokenizers 0.12.1

+This is an "extension" of the multilingual `twitter-xlm-roberta-base-sentiment` model ([model](https://huggingface.co/cardiffnlp/twitter-xlm-roberta-base-sentiment), [original paper](https://arxiv.org/abs/2104.12250)) with a focus on sentiment in politicians' tweets. The original sentiment fine-tuning was done on 8 languages (Ar, En, Fr, De, Hi, It, Sp, Pt); further training was then done using tweets from Members of Parliament from the UK (English), Spain (Spanish) and Greece (Greek).
+- Reference Paper: [Politics, Sentiment and Virality: A Large-Scale Multilingual Twitter Analysis in Greece, Spain and United Kingdom](https://arxiv.org/pdf/2202.00396.pdf).
+- Git Repo: [https://github.com/cardiffnlp/politics-and-virality-twitter](https://github.com/cardiffnlp/politics-and-virality-twitter).
+## Full classification example
+```python
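+from transformers import AutoModelForSequenceClassification
+from transformers import TFAutoModelForSequenceClassification
+from transformers import AutoTokenizer
+import numpy as np
+from scipy.special import softmax
+# Assumed helper: preprocess() is called below but was not defined in this snippet.
+# This version masks user mentions and links, following the preprocessing shown
+# in the base XLM-T sentiment model card.
+def preprocess(text):
+    new_text = []
+    for t in text.split(" "):
+        t = '@user' if t.startswith('@') and len(t) > 1 else t
+        t = 'http' if t.startswith('http') else t
+        new_text.append(t)
+    return " ".join(new_text)
+MODEL = "antypasd/XLM-T-Sent-Politics"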
+tokenizer = AutoTokenizer.from_pretrained(MODEL)
+# PT
+model = AutoModelForSequenceClassification.from_pretrained(MODEL)
+model.save_pretrained(MODEL)
+text = "Good night 😊"
+text = preprocess(text)
+encoded_input = tokenizer(text, return_tensors='pt')
+output = model(**encoded_input)
+scores = output[0][0].detach().numpy()
+scores = softmax(scores)
+# # TF
+# model = TFAutoModelForSequenceClassification.from_pretrained(MODEL)
+# model.save_pretrained(MODEL)
+# text = "Good night 😊"
+# text = preprocess(text)
+# encoded_input = tokenizer(text, return_tensors='tf')
+# output = model(encoded_input)
+# scores = output[0][0].numpy()
+# scores = softmax(scores)
+# Print scores in ascending order (i is the rank position, not the label id)
+ranking = np.argsort(scores)
+# ranking = ranking[::-1]  # uncomment to list the highest score first
+for i in range(scores.shape[0]):
+    s = scores[ranking[i]]
+    print(i, s)
+```
+Output:
+```
+0 0.0048229103
+1 0.03117284
+2 0.9640044
+```
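
The loop in the example prints rank positions rather than label names. Below is a minimal sketch of pairing each score with a readable label, assuming the id-to-label mapping is available as a dict (with Transformers models this is typically `model.config.id2label`). The function name `rank_labels`, the label names, and the scores are hypothetical placeholders, not values taken from this model.

```python
def rank_labels(scores, id2label):
    """Return (label, score) pairs sorted from most to least likely.

    scores   : sequence of class probabilities, indexed by label id
    id2label : dict mapping label id -> label name (assumed to be supplied
               by the caller, e.g. model.config.id2label)
    """
    pairs = [(id2label[i], float(s)) for i, s in enumerate(scores)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Placeholder inputs for illustration only (not real model output).
example_id2label = {0: "negative", 1: "neutral", 2: "positive"}
example_scores = [0.1, 0.2, 0.7]

for label, score in rank_labels(example_scores, example_id2label):
    print(f"{label}: {score:.4f}")
```

In the example above, `scores` and `model.config.id2label` could be passed in directly once the label names stored in the model config have been checked.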