Avijit Ghosh committed
Commit 3d19330
Parent: b2d9196

testing out yaml addition
Files changed:
- Images/CrowsPairs1.png (+0 -0)
- configs/crowspairs.yaml (+4 -3)
Images/CrowsPairs1.png (ADDED)
configs/crowspairs.yaml (CHANGED)

@@ -1,6 +1,6 @@
-Abstract: .
+Abstract: "Pretrained language models, especially masked language models (MLMs) have seen success across many NLP tasks. However, there is ample evidence that they use the cultural biases that are undoubtedly present in the corpora they are trained on, implicitly creating harm with biased representations. To measure some forms of social bias in language models against protected demographic groups in the US, we introduce the Crowdsourced Stereotype Pairs benchmark (CrowS-Pairs). CrowS-Pairs has 1508 examples that cover stereotypes dealing with nine types of bias, like race, religion, and age. In CrowS-Pairs a model is presented with two sentences: one that is more stereotyping and another that is less stereotyping. The data focuses on stereotypes about historically disadvantaged groups and contrasts them with advantaged groups. We find that all three of the widely-used MLMs we evaluate substantially favor sentences that express stereotypes in every category in CrowS-Pairs. As work on building less biased models advances, this dataset can be used as a benchmark to evaluate progress."
 'Applicable Models ': .nan
-Authors: .
+Authors: Nikita Nangia, Clara Vania, Rasika Bhalerao, Samuel R. Bowman
 Considerations: Automating stereotype detection makes distinguishing harmful stereotypes
   difficult. It also raises many false positives and can flag relatively neutral associations
   based in fact (e.g. population x has a high proportion of lactose intolerant people).
@@ -10,7 +10,8 @@ Hashtags: .nan
 Link: 'CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language
   Models'
 Modality: Text
-Screenshots:
+Screenshots:
+- Images/CrowsPairs1.png
 Suggested Evaluation: Crow-S Pairs
 Type: Dataset
 URL: https://arxiv.org/abs/2010.00133
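Since the commit message says it is testing out the YAML addition, a quick sanity check is to load the updated file and inspect the fields this diff touches. Below is a minimal sketch, assuming PyYAML is installed and configs/crowspairs.yaml matches the `+` side of the diff above; note that YAML parses the bare scalar `.nan` as a float NaN (not the string ".nan"), and the quoted key `'Applicable Models '` keeps its trailing space.

```python
# Minimal sanity check for configs/crowspairs.yaml after this commit.
# Assumes PyYAML (pip install pyyaml) and the file contents shown above.
import math

import yaml

with open("configs/crowspairs.yaml") as f:
    config = yaml.safe_load(f)

# `.nan` loads as a float NaN, and the quoted key keeps its trailing
# space, so it has to be looked up verbatim.
assert math.isnan(config["Applicable Models "])

# The newly added Screenshots entry should be a one-item list of paths.
assert config["Screenshots"] == ["Images/CrowsPairs1.png"]

print(config["Suggested Evaluation"], "->", config["URL"])
```

Checks like these catch the easy YAML mistakes (a stray tab, a key that silently becomes NaN) before the Space tries to render the config.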
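The abstract describes the benchmark's core operation: present a masked language model with the two sentences of a pair and check which one it assigns higher likelihood. The paper's own metric is a pseudo-log-likelihood variant that masks one unmodified token at a time; the sketch below is a plain per-token pseudo-log-likelihood with `bert-base-uncased`, purely for illustration, with placeholder strings standing in for a real CrowS-Pairs pair.

```python
# Illustrative pairwise comparison in the spirit of CrowS-Pairs.
# Not the paper's exact scoring method; a simplified sketch only.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

def pseudo_log_likelihood(sentence: str) -> float:
    """Sum each token's log-prob when that token alone is masked."""
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, len(ids) - 1):  # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

more = "One stereotyping sentence from a pair."    # placeholder text
less = "Its less-stereotyping counterpart."        # placeholder text

# True means the model prefers the more-stereotyping sentence.
print(pseudo_log_likelihood(more) > pseudo_log_likelihood(less))
```

Aggregating this preference over all 1508 pairs gives the kind of per-category bias rate the abstract reports for the three evaluated MLMs.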