Added Affiliations and Methodology YAML fields to all config files

Files changed (11)

configs/crowspairs.yaml

@@ -17,4 +17,6 @@ Suggested Evaluation: Crow-S Pairs
 Level: Dataset
 URL: https://arxiv.org/abs/2010.00133
 What it is evaluating: Protected class stereotypes
-Metrics: .nan
+Metrics: .nan
+Affiliations: .nan
+Methodology: .nan

configs/honest.yaml

@@ -15,3 +15,5 @@ Level: Output
 URL: https://aclanthology.org/2021.naacl-main.191.pdf
 What it is evaluating: Protected class stereotypes and hurtful language
 Metrics: .nan
+Affiliations: .nan
+Methodology: .nan

configs/ieat.yaml

@@ -14,4 +14,6 @@ Suggested Evaluation: Image Embedding Association Test (iEAT)
 Level: Model
 URL: https://dl.acm.org/doi/abs/10.1145/3442188.3445932
 What it is evaluating: Embedding associations
-Metrics: .nan
+Metrics: .nan
+Affiliations: .nan
+Methodology: .nan

configs/imagedataleak.yaml

@@ -13,4 +13,6 @@ Suggested Evaluation: Dataset leakage and model leakage
 Level: Dataset
 URL: https://arxiv.org/abs/1811.08489
 What it is evaluating: Gender and label bias
-Metrics: .nan
+Metrics: .nan
+Affiliations: .nan
+Methodology: .nan

configs/measuringforgetting.yaml

@@ -17,4 +17,6 @@ Suggested Evaluation: Measuring forgetting of training examples
 Level: Model
 URL: https://arxiv.org/pdf/2207.00099.pdf
 What it is evaluating: Measure whether models forget training examples over time, over different types of models (image, audio, text) and how order of training affects privacy attacks
-Metrics: .nan
+Metrics: .nan
+Affiliations: .nan
+Methodology: .nan

configs/notmyvoice.yaml

@@ -14,4 +14,6 @@ Suggested Evaluation: Not My Voice! A Taxonomy of Ethical and Safety Harms of Sp
 Level: Taxonomy
 URL: https://arxiv.org/pdf/2402.01708.pdf
 What it is evaluating: Lists harms of audio/speech generators
-Metrics: .nan
+Metrics: .nan
+Affiliations: .nan
+Methodology: .nan

configs/palms.yaml

@@ -12,4 +12,6 @@ Suggested Evaluation: Human and Toxicity Evals of Cultural Value Categories
 Level: Output
 URL: http://arxiv.org/abs/2106.10328
 What it is evaluating: Adherence to defined norms for a set of cultural categories
-Metrics: .nan
+Metrics: .nan
+Affiliations: .nan
+Methodology: .nan

configs/safelatentdiff.yaml

@@ -15,4 +15,6 @@ Suggested Evaluation: Evaluating text-to-image models for safety
 Level: Output
 URL: https://arxiv.org/pdf/2211.05105.pdf
 What it is evaluating: Generating images for diverse set of prompts (novel I2P benchmark) and investigating how often e.g. violent/nude images will be generated. There is a distinction between implicit and explicit safety, i.e. unsafe results with “normal” prompts.
-Metrics: .nan
+Metrics: .nan
+Affiliations: .nan
+Methodology: .nan

configs/stablebias.yaml

@@ -13,3 +13,5 @@ Level: Output
 URL: https://arxiv.org/abs/2303.11408
 What it is evaluating: .nan
 Metrics: .nan
+Affiliations: .nan
+Methodology: .nan

configs/tango.yaml

@@ -17,4 +17,6 @@ Suggested Evaluation: Human and Toxicity Evals of Cultural Value Categories
 Level: Output
 URL: http://arxiv.org/abs/2106.10328
 What it is evaluating: Bias measurement for trans and nonbinary community via measuring gender non-affirmative language, specifically 1) misgendering 2), negative responses to gender disclosure
-Metrics: .nan
+Metrics: .nan
+Affiliations: .nan
+Methodology: .nan

configs/videodiversemisinfo.yaml

@@ -14,4 +14,6 @@ Level: Output
 URL: https://arxiv.org/abs/2210.10026
 What it is evaluating: Human led evaluations of deepfakes to understand susceptibility
   and representational harms (including political violence)
-Metrics: .nan
+Metrics: .nan
+Affiliations: .nan
+Methodology: .nan
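
The same change repeats across all eleven configs, so it may help to see it as a small script. The following is only a sketch under stated assumptions, not the tooling actually used for this commit: the names `REQUIRED_FIELDS`, `missing_fields`, and `append_placeholders` are hypothetical, and the key detection is a plain line scan rather than a real YAML parser. It also assumes that the paired `-Metrics` / `+Metrics` lines in some hunks come from a missing trailing newline in the old file, which the diff does not state explicitly.

```python
# Hypothetical helper names; a line-based sketch, not a full YAML parser.
REQUIRED_FIELDS = ("Metrics", "Affiliations", "Methodology")


def missing_fields(yaml_text, required=REQUIRED_FIELDS):
    """Return the required top-level keys that do not appear in yaml_text."""
    present = set()
    for line in yaml_text.splitlines():
        # Top-level keys start in column 0; indented lines are treated as
        # continuations of the previous value and skipped.
        if line and not line[0].isspace() and ":" in line:
            present.add(line.split(":", 1)[0].strip())
    return [name for name in required if name not in present]


def append_placeholders(yaml_text, required=REQUIRED_FIELDS):
    """Append `<key>: .nan` for each missing key, ending with a newline."""
    body = yaml_text
    if body and not body.endswith("\n"):
        # Assumption: the old files may lack a trailing newline.
        body += "\n"
    for name in missing_fields(yaml_text, required):
        body += f"{name}: .nan\n"
    return body
```

Applied to the text of each file under `configs/`, `append_placeholders` would leave already-complete files unchanged apart from ensuring a final newline, which makes it safe to re-run.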