---
language:
- en
widget:
- text: Call me today to earn some money mofos!
datasets:
- cardiffnlp/x_sensitive
license: mit
metrics:
- f1
pipeline_tag: text-classification
---

# twitter-roberta-large-sensitive-binary

This is a RoBERTa-large model pre-trained on 154M tweets posted up to the end of December 2022 and fine-tuned for sensitive content detection (binary classification) on the [_X-Sensitive_](https://huggingface.co/datasets/cardiffnlp/x_sensitive) dataset.
The base Twitter RoBERTa model it was fine-tuned from can be found [here](https://huggingface.co/cardiffnlp/twitter-roberta-large-2022-154m).



## Labels
```
"id2label": {
    "0": "non-sensitive",
    "1": "sensitive"
  }
```
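
As a rough illustration of how this mapping is used, the sketch below turns a pair of raw logits into a label name and probability. It assumes the classification head emits one logit per id in the order shown above and that a softmax is applied over them; the helper name `logits_to_label` and the logit values are hypothetical, not taken from the model.

```python
import math

# Mirrors the id2label mapping above, with integer keys for convenience.
ID2LABEL = {0: "non-sensitive", 1: "sensitive"}

def logits_to_label(logits):
    """Apply a numerically stable softmax and return (label, probability)."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]

# Hypothetical logits, chosen only for illustration.
label, prob = logits_to_label([-2.0, 3.0])
print(label, round(prob, 4))
```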

## Full classification example

```python
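from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub
pipe = pipeline("text-classification", model="cardiffnlp/twitter-roberta-large-sensitive-binary")
text = "Call me today to earn some money mofos!"

# Returns a list with the top label and its score
pipe(text)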
```
Output: 

```
[{'label': 'sensitive', 'score': 0.999821126461029}]
```
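
For downstream filtering it can be handy to reduce this output to a boolean. The following is a minimal sketch, assuming the output has the list-of-dicts shape shown above; the helper name `is_sensitive` and the default threshold of 0.5 are illustrative choices, not part of the model or the `transformers` API.

```python
def is_sensitive(predictions, threshold=0.5):
    """Return True if the top prediction is 'sensitive' with score >= threshold."""
    top = predictions[0]
    return top["label"] == "sensitive" and top["score"] >= threshold

# Reuses the example output shown above instead of calling the model.
example = [{"label": "sensitive", "score": 0.999821126461029}]
print(is_sensitive(example))                    # True
print(is_sensitive(example, threshold=0.9999))  # False
```

A higher threshold trades recall for precision; a suitable value depends on the application and should be checked against held-out data.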



## BibTeX entry and citation info

```
@article{antypas2024sensitive,
  title={Sensitive Content Classification in Social Media: A Holistic Resource and Evaluation},
  author={Antypas, Dimosthenis and Sen, Indira and Perez-Almendros, Carla and Camacho-Collados, Jose and Barbieri, Francesco},
  journal={arXiv preprint arXiv:2411.19832},
  year={2024}
}
```