	---
	language:
	- en
	license: mit
	datasets:
	- cardiffnlp/x_sensitive
	metrics:
	- f1
	widget:
	- text: Call me today to earn some money mofos!
	pipeline_tag: text-classification
	---

	# twitter-roberta-base-sensitive-multilabel

	This is a RoBERTa-base model pretrained on 154M tweets posted up to the end of December 2022 and fine-tuned for sensitive content detection (multilabel classification) on the [_X-Sensitive_](https://huggingface.co/datasets/cardiffnlp/x_sensitive) dataset.
	The underlying Twitter-based RoBERTa model is available [here](https://huggingface.co/cardiffnlp/twitter-roberta-base-2022-154m).



	## Labels
	```
	"id2label": {
	"0": "conflictual",
	"1": "profanity",
	"2": "sex",
	"3": "drugs",
	"4": "selfharm",
	"5": "spam",
	"6": "not-sensitive"
	}
	```
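Because the task is multilabel, each of the seven labels is typically scored independently instead of competing in a softmax, so a text can receive several labels at once. The sketch below illustrates that decoding step in plain Python. It assumes a sigmoid activation and a 0.5 cut-off, neither of which is stated in this card; the `decode_logits` helper and the example logits are hypothetical.

```python
import math

# Label mapping copied from the "Labels" section of this card.
ID2LABEL = {
    0: "conflictual",
    1: "profanity",
    2: "sex",
    3: "drugs",
    4: "selfharm",
    5: "spam",
    6: "not-sensitive",
}


def sigmoid(x: float) -> float:
    """Map a raw logit to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_logits(logits, threshold=0.5):
    """Return every label whose independent sigmoid score reaches the threshold.

    The 0.5 threshold is an illustrative assumption, not a value from the card.
    """
    if len(logits) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} logits, got {len(logits)}")
    return [
        ID2LABEL[i]
        for i, logit in enumerate(logits)
        if sigmoid(logit) >= threshold
    ]


# Made-up logits for illustration: two labels clear the threshold.
example_logits = [-2.5, 4.5, -5.7, -5.4, -5.6, 0.3, -4.6]
print(decode_logits(example_logits))
```

Unlike an argmax over a softmax, this can return zero, one, or several labels for the same text.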

	## Full classification example

	```python
	from transformers import pipeline

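	# top_k=None asks the pipeline to return a score for every label, not only the top one.
	pipe = pipeline('text-classification', model='cardiffnlp/twitter-roberta-base-sensitive-multilabel', top_k=None)
	text = "Call me today to earn some money mofos!"

	pipe(text)
	```
	Output (the ordering and nesting of the results may vary with the installed `transformers` version):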

	```
	[[{'label': 'conflictual', 'score': 0.07463070750236511},
	{'label': 'profanity', 'score': 0.9888035655021667},
	{'label': 'sex', 'score': 0.0032050721347332},
	{'label': 'drugs', 'score': 0.004522938746958971},
	{'label': 'selfharm', 'score': 0.0036733713932335377},
	{'label': 'spam', 'score': 0.007278479170054197},
	{'label': 'not-sensitive', 'score': 0.00972921121865511}]]
	```
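To turn per-label scores like these into a final set of predicted labels, one common approach is to keep every label whose score reaches a threshold. The following is a minimal sketch under that assumption: the `labels_above` helper and the 0.5 threshold are hypothetical and not part of this model card, and the sample scores are copied from the output shown above.

```python
def labels_above(results, threshold=0.5):
    """Keep (label, score) pairs at or above the threshold, highest score first.

    `results` is a list of {'label': ..., 'score': ...} dicts for one text.
    """
    kept = [(r["label"], r["score"]) for r in results if r["score"] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Scores for the example sentence, copied from the output above.
pipeline_output = [[
    {"label": "conflictual", "score": 0.07463070750236511},
    {"label": "profanity", "score": 0.9888035655021667},
    {"label": "sex", "score": 0.0032050721347332},
    {"label": "drugs", "score": 0.004522938746958971},
    {"label": "selfharm", "score": 0.0036733713932335377},
    {"label": "spam", "score": 0.007278479170054197},
    {"label": "not-sensitive", "score": 0.00972921121865511},
]]

# The outer list holds one entry per input text; take the first (and only) one.
selected = labels_above(pipeline_output[0])
print(selected)
```

A lower threshold trades precision for recall; an appropriate value would need to be tuned on held-out data.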



	## BibTeX entry and citation info

	```
	@article{antypas2024sensitive,
	title={Sensitive Content Classification in Social Media: A Holistic Resource and Evaluation},
	author={Antypas, Dimosthenis and Sen, Indira and Perez-Almendros, Carla and Camacho-Collados, Jose and Barbieri, Francesco},
	journal={arXiv preprint arXiv:2411.19832},
	year={2024}
	}
	```