youngggggg committed
Commit 8a711aa
1 Parent(s): 888a879

Update README.md

Files changed (1)
  1. README.md +3 -25
README.md CHANGED
@@ -8,6 +8,9 @@ language:
 
 # Model Card for ToxiGen-ConPrompt
 
+**ToxiGen-ConPrompt** is a pre-trained language model for implicit hate speech detection.
+The model is pre-trained on a machine-generated dataset for implicit hate speech detection (i.e., *ToxiGen*) using our proposed pre-training approach (i.e., *ConPrompt*).
+
 <!-- Provide a quick summary of what the model is/does. -->
 
 <!-- {{ model_summary | default("", true) }} -->
@@ -29,7 +32,6 @@ language:
 - **Pre-training Approach:** ConPrompt
 
 <!-- Provide the basic links for the model. -->
-
 - **ConPrompt Repository:** https://github.com/youngwook06/ConPrompt
 - **ConPrompt Paper:** https://aclanthology.org/2023.findings-emnlp.731/
 
@@ -53,27 +55,3 @@ While these behavior can lead to social good e.g., constructing training data fo
 **We strongly emphasize the need for careful handling to prevent unintentional misuse and warn against malicious exploitation of such behaviors.**
 
 
-## Citation
-
-**BibTeX:**
-
-@inproceedings{kim-etal-2023-conprompt,
-    title = "{C}on{P}rompt: Pre-training a Language Model with Machine-Generated Data for Implicit Hate Speech Detection",
-    author = "Kim, Youngwook and
-      Park, Shinwoo and
-      Namgoong, Youngsoo and
-      Han, Yo-Sub",
-    editor = "Bouamor, Houda and
-      Pino, Juan and
-      Bali, Kalika",
-    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2023",
-    month = dec,
-    year = "2023",
-    address = "Singapore",
-    publisher = "Association for Computational Linguistics",
-    url = "https://aclanthology.org/2023.findings-emnlp.731",
-    doi = "10.18653/v1/2023.findings-emnlp.731",
-    pages = "10964--10980",
-    abstract = "Implicit hate speech detection is a challenging task in text classification since no explicit cues (e.g., swear words) exist in the text. While some pre-trained language models have been developed for hate speech detection, they are not specialized in implicit hate speech. Recently, an implicit hate speech dataset with a massive number of samples has been proposed by controlling machine generation. We propose a pre-training approach, ConPrompt, to fully leverage such machine-generated data. Specifically, given a machine-generated statement, we use example statements of its origin prompt as positive samples for contrastive learning. Through pre-training with ConPrompt, we present ToxiGen-ConPrompt, a pre-trained language model for implicit hate speech detection. We conduct extensive experiments on several implicit hate speech datasets and show the superior generalization ability of ToxiGen-ConPrompt compared to other pre-trained models. Additionally, we empirically show that ConPrompt is effective in mitigating identity term bias, demonstrating that it not only makes a model more generalizable but also reduces unintended bias. We analyze the representation quality of ToxiGen-ConPrompt and show its ability to consider target group and toxicity, which are desirable features in terms of implicit hate speeches.",
-}
-
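
Since the card added in this commit describes a pre-trained encoder, here is a minimal sketch of how one might load it with the Hugging Face `transformers` library. The repository id `youngggggg/ToxiGen-ConPrompt` and the assumption that the checkpoint is a BERT-style encoder are inferred from the committer and model names, not stated in this diff; adjust them to the actual model page.

```python
# Hedged sketch, not documented usage from the model card.
# ASSUMPTIONS: the checkpoint is hosted at "youngggggg/ToxiGen-ConPrompt"
# (hypothetical repository id) and is a BERT-style encoder loadable by AutoModel.
from transformers import AutoModel, AutoTokenizer

model_id = "youngggggg/ToxiGen-ConPrompt"  # hypothetical repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

# Encode one statement; for implicit hate speech detection the encoder would
# typically be fine-tuned with a classification head on a labeled dataset.
inputs = tokenizer("an example statement to encode", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)
```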
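
The abstract quoted in the removed citation describes the core ConPrompt idea: statements generated from the same origin prompt serve as positive samples for contrastive learning. The snippet below is only an illustrative InfoNCE-style rendering of that idea under my own assumptions (loss form, temperature, batching); the actual objective is defined in the ConPrompt paper and repository linked above.

```python
# Illustrative sketch of prompt-origin contrastive learning, NOT the ConPrompt
# implementation. Statements sharing an origin prompt are treated as positives.
import torch
import torch.nn.functional as F

def prompt_contrastive_loss(embeddings, prompt_ids, temperature=0.05):
    """embeddings: (batch, dim) sentence embeddings.
    prompt_ids: (batch,) id of the origin prompt of each generated statement."""
    z = F.normalize(embeddings, dim=-1)
    sim = z @ z.t() / temperature                          # pairwise cosine similarities
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))        # exclude self-pairs
    positives = (prompt_ids.unsqueeze(0) == prompt_ids.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_log_prob = log_prob.masked_fill(~positives, 0.0)   # keep only positive pairs
    pos_counts = positives.sum(dim=1)
    has_pos = pos_counts > 0                               # anchors with >= 1 positive
    loss = -(pos_log_prob.sum(dim=1)[has_pos] / pos_counts[has_pos])
    return loss.mean()

# Toy usage: four statements, the first two from prompt 0, the last two from prompt 1.
emb = torch.randn(4, 768)
ids = torch.tensor([0, 0, 1, 1])
print(prompt_contrastive_loss(emb, ids))
```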