---
datasets:
- tweet_eval
metrics:
- f1
- accuracy
model-index:
- name: cardiffnlp/roberta-base-offensive
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: tweet_eval
      type: offensive
      split: test 
    metrics:
    - name: Micro F1 (tweet_eval/offensive)
      type: micro_f1_tweet_eval/offensive
      value: 0.8441860465116279
    - name: Macro F1 (tweet_eval/offensive)
      type: macro_f1_tweet_eval/offensive
      value: 0.8038468085106383
    - name: Accuracy (tweet_eval/offensive)
      type: accuracy_tweet_eval/offensive
      value: 0.8441860465116279
pipeline_tag: text-classification
widget:
- text: Get the all-analog Classic Vinyl Edition of "Takin Off" Album from {@herbiehancock@} via {@bluenoterecords@} link below {{URL}}
  example_title: "topic_classification 1" 
- text: Yes, including Medicare and social security saving👍
  example_title: "sentiment 1" 
- text: All two of them taste like ass.
  example_title: "offensive 1" 
- text: If you wanna look like a badass, have drama on social media
  example_title: "irony 1" 
- text: Whoever just unfollowed me you a bitch
  example_title: "hate 1" 
- text: I love swimming for the same reason I love meditating...the feeling of weightlessness.
  example_title: "emotion 1" 
- text: Beautiful sunset last night from the pontoon @TupperLakeNY
  example_title: "emoji 1" 
---
# cardiffnlp/roberta-base-offensive 

This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on the
[`tweet_eval (offensive)`](https://huggingface.co/datasets/tweet_eval) dataset,
trained via [`tweetnlp`](https://github.com/cardiffnlp/tweetnlp).
The model was trained on the `train` split, and hyperparameters were tuned on the `validation` split.

The following metrics were achieved on the `test` split ([link](https://huggingface.co/cardiffnlp/roberta-base-offensive/raw/main/metric.json)).

- F1 (micro): 0.8441860465116279
- F1 (macro): 0.8038468085106383
- Accuracy: 0.8441860465116279
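
Micro F1 and accuracy are identical above, which is expected for single-label classification: every misclassified example counts as one false positive and one false negative, so micro-averaged F1 reduces to accuracy. The sketch below illustrates this with the standard metric definitions; it is not the evaluation script used for this model, and the helper name `f1_scores` and the toy labels are invented for illustration.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1, accuracy) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # the predicted class gains a false positive
            fn[t] += 1  # the true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    accuracy = sum(tp.values()) / len(y_true)
    return micro, macro, accuracy


# Toy labels, invented for illustration only (not taken from tweet_eval).
y_true = ["offensive", "non-offensive", "non-offensive", "non-offensive", "offensive"]
y_pred = ["offensive", "non-offensive", "non-offensive", "offensive", "non-offensive"]

micro, macro, accuracy = f1_scores(y_true, y_pred)
print(f"micro F1={micro:.4f}  macro F1={macro:.4f}  accuracy={accuracy:.4f}")
```

Macro F1 is lower than micro F1 whenever the model does worse on the rarer class, which is consistent with the gap between the two scores reported above.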

### Usage
Install tweetnlp via pip.
```shell
pip install tweetnlp
```
Load the model in Python.
```python
import tweetnlp
model = tweetnlp.Classifier("cardiffnlp/roberta-base-offensive", max_length=128)
model.predict('Get the all-analog Classic Vinyl Edition of "Takin Off" Album from {@herbiehancock@} via {@bluenoterecords@} link below {{URL}}')
```
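
To act on a prediction, it may help to turn it into a yes/no decision. The following is a minimal sketch that assumes the prediction is a dict with a `label` key and, if probabilities were requested, a `probability` mapping from label name to score. The label names `offensive` and `non-offensive`, the helper `is_offensive`, and the stub values are assumptions for illustration, so check them against the model's actual output before relying on them.

```python
def is_offensive(prediction, threshold=0.5, positive_label="offensive"):
    """Return True if a prediction dict should be treated as offensive.

    Assumed shape (hypothetical, verify against real output):
        {"label": str, "probability": {label_name: float}}
    """
    probability = prediction.get("probability")
    if probability is None:
        # No scores available: fall back to the predicted label.
        return prediction["label"] == positive_label
    return probability.get(positive_label, 0.0) >= threshold


# Stub prediction with made-up scores, standing in for a real model output.
stub = {"label": "offensive", "probability": {"non-offensive": 0.3, "offensive": 0.7}}

print(is_offensive(stub))                 # default threshold of 0.5
print(is_offensive(stub, threshold=0.8))  # stricter threshold
```

Raising the threshold trades recall for precision, which can be useful when false alarms are costly.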



### Reference 

 
```
@inproceedings{camacho-collados-etal-2022-tweetnlp,
    title = "{T}weet{NLP}: {C}utting-{E}dge {N}atural {L}anguage {P}rocessing for {S}ocial {M}edia",
    author = "Camacho-Collados, Jose and Rezaee, Kiamehr and Riahi, Talayeh and Ushio, Asahi and Loureiro, Daniel and Antypas, Dimosthenis and Boisson, Joanne and Espinosa-Anke, Luis and Liu, Fangyu and Mart{\'\i}nez-C{\'a}mara, Eugenio and others",
    booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
    month = nov,
    year = "2022",
    address = "Abu Dhabi, U.A.E.",
    publisher = "Association for Computational Linguistics",
}
```