antypasd committed on
Commit 222085a • 1 Parent(s): 3ebc4c5

Update README.md

Files changed (1)
  1. README.md +42 -23
README.md CHANGED
@@ -6,42 +6,61 @@ model-index:
  results: []
  ---
 
- <!-- This model card has been generated automatically according to the information Keras had access to. You should
- probably proofread and complete it, then remove this comment. -->
-
  # XLM-T-Sent-Politics
 
- This model was trained from scratch on an unknown dataset.
- It achieves the following results on the evaluation set:
-
 
- ## Model description
 
- More information needed
 
- ## Intended uses & limitations
 
- More information needed
 
- ## Training and evaluation data
 
- More information needed
 
- ## Training procedure
 
- ### Training hyperparameters
 
- The following hyperparameters were used during training:
- - optimizer: None
- - training_precision: float32
 
- ### Training results
 
 
 
- ### Framework versions
 
- - Transformers 4.21.0
- - TensorFlow 2.8.2
- - Datasets 2.4.0
- - Tokenizers 0.12.1
  results: []
  ---
 
  # XLM-T-Sent-Politics
 
+ This is an "extension" of the multilingual `twitter-xlm-roberta-base-sentiment` model ([model](cardiffnlp/twitter-xlm-roberta-base-sentiment), [original paper](https://arxiv.org/abs/2104.12250)) with a focus on sentiment in politicians' tweets. The original sentiment fine-tuning was done on 8 languages (Ar, En, Fr, De, Hi, It, Sp, Pt), and further training was done on tweets from Members of Parliament from the UK (English), Spain (Spanish) and Greece (Greek).
 
+ - Reference Paper: [Politics, Sentiment and Virality: A Large-Scale Multilingual Twitter Analysis in Greece, Spain and United Kingdom](https://arxiv.org/pdf/2202.00396.pdf).
+ - Git Repo: [https://github.com/cardiffnlp/politics-and-virality-twitter](https://github.com/cardiffnlp/politics-and-virality-twitter).
 
 
+ ## Full classification example
 
+ ```python
+ from transformers import AutoModelForSequenceClassification
+ from transformers import TFAutoModelForSequenceClassification
+ from transformers import AutoTokenizer
+ import numpy as np
+ from scipy.special import softmax
 
+ MODEL = f"antypasd/XLM-T-Sent-Politics"
 
+ tokenizer = AutoTokenizer.from_pretrained(MODEL)
 
+ # PT
+ model = AutoModelForSequenceClassification.from_pretrained(MODEL)
+ model.save_pretrained(MODEL)
 
+ text = "Good night 😊"
+ text = preprocess(text)
+ encoded_input = tokenizer(text, return_tensors='pt')
+ output = model(**encoded_input)
+ scores = output[0][0].detach().numpy()
+ scores = softmax(scores)
 
+ # # TF
+ # model = TFAutoModelForSequenceClassification.from_pretrained(MODEL)
+ # model.save_pretrained(MODEL)
 
+ # text = "Good night 😊"
+ # encoded_input = tokenizer(text, return_tensors='tf')
+ # output = model(encoded_input)
+ # scores = output[0][0].numpy()
+ # scores = softmax(scores)
 
+ # Print labels and scores
+ ranking = np.argsort(scores)
+ #ranking = ranking[::-1]
+ for i in range(scores.shape[0]):
+     s = scores[ranking[i]]
+     print(i, s)
 
+ ```
 
+ Output:
 
+ ```
+ 0 0.0048229103
+ 1 0.03117284
+ 2 0.9640044
+ ```
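Note on the added example: it calls `preprocess` without defining it, and its final loop prints a rank position next to each score rather than a class id. The sketch below shows one way both pieces might look. It assumes `preprocess` follows the placeholder-substitution convention of the base `twitter-xlm-roberta-base-sentiment` model card (mentions become `@user`, links become `http`); that is an assumption, not something this commit confirms. `rank_scores` is a hypothetical helper name, the logits are made-up values rather than model output, and only the standard library is used so the snippet runs without downloading the model.

```python
import math


def preprocess(text):
    """Replace user mentions and links with generic placeholders (assumed convention)."""
    tokens = []
    for token in text.split(" "):
        if token.startswith("@") and len(token) > 1:
            token = "@user"
        elif token.startswith("http"):
            token = "http"
        tokens.append(token)
    return " ".join(tokens)


def rank_scores(logits):
    """Softmax the logits and return (class_id, probability) pairs, highest first."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in order]


print(preprocess("@someone see https://example.com"))  # prints "@user see http"

# Illustrative logits only; real values would come from the model's output.
for class_id, prob in rank_scores([-2.0, 0.5, 3.0]):
    print(class_id, round(prob, 4))
```

Printing the class id alongside each probability keeps the output interpretable once the ids are mapped to label names; which id corresponds to which sentiment should be checked against the model's configuration rather than assumed.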