Update README.md
README.md
@@ -1,47 +1,42 @@
 ---
-tags:
-- generated_from_keras_callback
 model-index:
 - name: twitter-roberta-base-hate-latest
   results: []
 ---
|
8 |
-
|
9 |
-
|
10 |
-
|
11 |
-
|
12 |
-
|
13 |
-
|
14 |
-
|
15 |
-
|
16 |
-
|
17 |
-
|
18 |
-
|
19 |
-
|
20 |
-
|
21 |
-
|
22 |
-
|
23 |
-
|
24 |
-
|
25 |
-
|
26 |
-
|
27 |
-
|
28 |
-
|
29 |
-
|
30 |
-
|
31 |
-
|
32 |
-
|
33 |
-
|
34 |
-
|
35 |
-
|
36 |
-
|
37 |
-
|
38 |
-
|
39 |
-
|
40 |
-
|
41 |
-
|
42 |
-
|
43 |
-
-- Transformers 4.21.2
-- TensorFlow 2.10.0
-- Datasets 2.9.0
-- Tokenizers 0.12.1
 ---
 model-index:
 - name: twitter-roberta-base-hate-latest
   results: []
+pipeline_tag: text-classification
 ---
+# cardiffnlp/twitter-roberta-base-hate-latest
+
+This model is a fine-tuned version of [cardiffnlp/twitter-roberta-base-2022-154m](https://huggingface.co/cardiffnlp/twitter-roberta-base-2022-154m) for binary hate-speech classification. It was fine-tuned on a combination of 13 English-language hate-speech datasets.
+
+## The following metrics are achieved
+| **Dataset** | **Accuracy** | **Macro-F1** | **Weighted-F1** |
+|------------------------------------------------------------------------------------------------------------------------------------------------------|:------------:|:------------:|:---------------:|
+| hatEval, SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter | 0.5848 | 0.5657 | 0.5514 |
+| ucberkeley-dlab/measuring-hate-speech | 0.8706 | 0.8531 | 0.8701 |
+| Detecting East Asian Prejudice on Social Media | 0.9276 | 0.8935 | 0.9273 |
+| Call me sexist, but | 0.9033 | 0.6288 | 0.8852 |
+| Predicting the Type and Target of Offensive Posts in Social Media | 0.9075 | 0.5984 | 0.8935 |
+| HateXplain | 0.9594 | 0.8024 | 0.9600 |
+| Large Scale Crowdsourcing and Characterization of Twitter Abusive Behavior | 0.6817 | 0.5939 | 0.6233 |
+| Twitter Sentiment Analysis | 0.9808 | 0.9258 | 0.9807 |
+| Overview of the HASOC track at FIRE 2019: Hate Speech and Offensive Content Identification in Indo-European Languages | 0.8665 | 0.5562 | 0.8343 |
+| Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter | 0.9465 | 0.8557 | 0.9440 |
+| Automated Hate Speech Detection and the Problem of Offensive Language | 0.9116 | 0.8797 | 0.9100 |
+| Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter | 0.8378 | 0.8338 | 0.8385 |
+| Multilingual and Multi-Aspect Hate Speech Analysis | 0.9655 | 0.4912 | 0.9824 |
+| **Overall** | **0.8827** | **0.8383** | **0.8842** |
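Note on reading the three metric columns: the README does not say how the scores were computed, so the sketch below assumes the usual definitions (per-class F1 averaged equally for Macro-F1, and averaged by class support for Weighted-F1). The labels are made-up toy data, not drawn from any of the listed datasets, and the `HATE` label name is an assumption (the README only shows `NOT-HATE`). It illustrates one way a row can pair high accuracy with a much lower Macro-F1: a classifier that rarely predicts the minority class.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, label):
    """F1 for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def summarize(y_true, y_pred):
    """Return accuracy, macro-F1 and support-weighted F1."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true examples per class
    f1 = {lab: per_class_f1(y_true, y_pred, lab) for lab in labels}
    n = len(y_true)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / n,
        "macro_f1": sum(f1.values()) / len(labels),
        "weighted_f1": sum(f1[lab] * support[lab] for lab in labels) / n,
    }


# Toy data: 18 NOT-HATE and 2 HATE examples ("HATE" is an assumed label name).
y_true = ["NOT-HATE"] * 18 + ["HATE"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["NOT-HATE"] * 20

scores = summarize(y_true, y_pred)
print(scores)
# accuracy = 0.9, macro_f1 is about 0.474, weighted_f1 is about 0.853
```

Macro-F1 drops because the minority class contributes an F1 of 0 with the same weight as the majority class, while accuracy and Weighted-F1 are dominated by the majority class.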
+
+
+### Usage
+Install tweetnlp via pip.
+```shell
+pip install tweetnlp
+```
+Load the model in Python.
+```python
+import tweetnlp
+model = tweetnlp.Classifier("cardiffnlp/twitter-roberta-base-hate-latest")
+model.predict('I love everybody :)')
+# Output: {'label': 'NOT-HATE'}
+
+```
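Note on the usage snippet: it classifies a single string. For several texts, a small wrapper can tally or filter the predictions. This is only a sketch. `label_counts` and `flagged` are hypothetical helper names, not part of tweetnlp, and the code assumes nothing beyond what the README output shows: `predict` takes one string and returns a dict with a `'label'` key.

```python
from collections import Counter


def label_counts(model, texts):
    """Tally predicted labels over `texts`.

    `model` is any object whose predict(text) returns a dict with a
    'label' key, as in the README example output.
    """
    return Counter(model.predict(text)["label"] for text in texts)


def flagged(model, texts, safe_label="NOT-HATE"):
    """Return the texts whose predicted label differs from `safe_label`.

    Only 'NOT-HATE' appears in the README, so the name of the other
    label is deliberately not assumed here.
    """
    return [text for text in texts if model.predict(text)["label"] != safe_label]


# With the real model (needs `pip install tweetnlp` and a model download):
#   import tweetnlp
#   model = tweetnlp.Classifier("cardiffnlp/twitter-roberta-base-hate-latest")
#   print(label_counts(model, ["I love everybody :)"]))
```

Calling `predict` once per text keeps the sketch within the behavior the README demonstrates; whether a batched call would be faster is not covered by the source.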