asahi417 committed on
Commit
b2a9219
1 Parent(s): 3130bac
Files changed (2)
  1. README.md +89 -0
  2. metric_summary.json +1 -1
README.md ADDED
@@ -0,0 +1,89 @@
---
datasets:
- cardiffnlp/tweet_topic_multi
metrics:
- f1
- accuracy
model-index:
- name: twitter-roberta-base-dec2020-tweet-topic-multi-all
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: cardiffnlp/tweet_topic_multi
      type: cardiffnlp/tweet_topic_multi
      args: cardiffnlp/tweet_topic_multi
      split: test_2021
    metrics:
    - name: F1
      type: f1
      value: 0.7599173553719007
    - name: F1 (macro)
      type: f1_macro
      value: 0.5990098728991452
    - name: Accuracy
      type: accuracy
      value: 0.5360333531864205
pipeline_tag: text-classification
widget:
- text: "I'm sure the {@Tampa Bay Lightning@} would’ve rather faced the Flyers but man does their experience versus the Blue Jackets this year and last help them a lot versus this Islanders team. Another meat grinder upcoming for the good guys"
  example_title: "Example 1"
- text: "Love to take night time bike rides at the jersey shore. Seaside Heights boardwalk. Beautiful weather. Wishing everyone a safe Labor Day weekend in the US."
  example_title: "Example 2"
---
# twitter-roberta-base-dec2020-tweet-topic-multi-all

This model is a fine-tuned version of [cardiffnlp/twitter-roberta-base-dec2020](https://huggingface.co/cardiffnlp/twitter-roberta-base-dec2020) on the [tweet_topic_multi](https://huggingface.co/datasets/cardiffnlp/tweet_topic_multi) dataset. It is fine-tuned on the `train_all` split and validated on the `test_2021` split of tweet_topic.
The fine-tuning script can be found [here](https://huggingface.co/datasets/cardiffnlp/tweet_topic_multi/blob/main/lm_finetuning.py). The model achieves the following results on the test_2021 set:

- F1 (micro): 0.7599173553719007
- F1 (macro): 0.5990098728991452
- Accuracy: 0.5360333531864205

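To see where these numbers come from, the following is a minimal evaluation sketch (not the official `lm_finetuning.py` script): it loads the `test_2021` split, thresholds per-class sigmoid scores at 0.5, and scores the predictions with `scikit-learn`. The gold-label column name (`gold_label_list`) is an assumption here; check the dataset schema before running.

```python
import numpy as np
import torch
from datasets import load_dataset
from sklearn.metrics import accuracy_score, f1_score
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "cardiffnlp/twitter-roberta-base-dec2020-tweet-topic-multi-all"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, problem_type="multi_label_classification")
model.eval()

data = load_dataset("cardiffnlp/tweet_topic_multi", split="test_2021")

predictions = []
with torch.no_grad():
    for text in data["text"]:
        tokens = tokenizer(text, return_tensors="pt", truncation=True)
        logits = model(**tokens).logits[0]
        # Multi-label prediction: independent sigmoid per class, 0.5 cutoff.
        predictions.append((torch.sigmoid(logits) > 0.5).long().tolist())

predictions = np.array(predictions)
labels = np.array(data["gold_label_list"])  # assumed column name
print("F1 (micro):", f1_score(labels, predictions, average="micro"))
print("F1 (macro):", f1_score(labels, predictions, average="macro"))
# sklearn's accuracy on label-indicator arrays is exact-match accuracy.
print("Accuracy:", accuracy_score(labels, predictions))
```
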
### Usage

```python
import math

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer


def sigmoid(x):
    return 1 / (1 + math.exp(-x))


tokenizer = AutoTokenizer.from_pretrained("cardiffnlp/twitter-roberta-base-dec2020-tweet-topic-multi-all")
model = AutoModelForSequenceClassification.from_pretrained(
    "cardiffnlp/twitter-roberta-base-dec2020-tweet-topic-multi-all",
    problem_type="multi_label_classification")
model.eval()
class_mapping = model.config.id2label

with torch.no_grad():
    text = "#NewVideo Cray Dollas- Water- Ft. Charlie Rose- (Official Music Video)- {{URL}} via {@YouTube@} #watchandlearn {{USERNAME}}"
    tokens = tokenizer(text, return_tensors='pt')
    output = model(**tokens)
    # Each logit gets an independent sigmoid; a class is predicted
    # when its probability exceeds 0.5.
    flags = [sigmoid(s) > 0.5 for s in output[0][0].detach().tolist()]
topic = [class_mapping[n] for n, i in enumerate(flags) if i]
print(topic)
```
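
The per-logit `sigmoid` with a 0.5 cutoff reflects the multi-label setup: loading the model with `problem_type="multi_label_classification"` corresponds to a per-class binary cross-entropy objective rather than a softmax over classes, so each topic score is thresholded independently and a tweet can receive zero, one, or several topics.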

### Reference
If you use this model, please consider citing our [paper](https://aclanthology.org/2021.eacl-demos.7/).

```
@inproceedings{dimosthenis-etal-2022-twitter,
    title = "{T}witter {T}opic {C}lassification",
    author = "Antypas, Dimosthenis and
      Ushio, Asahi and
      Camacho-Collados, Jose and
      Neves, Leonardo and
      Silva, Vitor and
      Barbieri, Francesco",
    booktitle = "Proceedings of the 29th International Conference on Computational Linguistics",
    month = oct,
    year = "2022",
    address = "Gyeongju, Republic of Korea",
    publisher = "International Committee on Computational Linguistics"
}
```
metric_summary.json CHANGED
@@ -1 +1 @@
- {"test/eval_loss": 0.1078503206372261, "test/eval_f1": 0.7599173553719007, "test/eval_f1_macro": 0.5990098728991452, "test/eval_accuracy": 0.5360333531864205, "test/eval_runtime": 73.6316, "test/eval_samples_per_second": 22.803, "test/eval_steps_per_second": 1.426}
+ {"test/eval_loss": 0.1078503206372261, "test/eval_f1": 0.7599173553719007, "test/eval_f1_macro": 0.5990098728991452, "test/eval_accuracy": 0.5360333531864205, "test/eval_runtime": 53.3062, "test/eval_samples_per_second": 31.497, "test/eval_steps_per_second": 1.97}