Bias/Fairness evaluation unclear

#3
by kmargatina - opened

How can I reproduce the results for the bias/fairness evaluation? It is not clear from the paper how you cast CrowS-Pairs, WinoGender and WinoBias as classification tasks. Did you use specific templates for these tasks?

"For each dataset, we evaluate between 5 and 10 prompts.": What does this mean?

Thank you in advance!

BigScience Workshop org

Hi @kmargatina ,

You will find the prompts used for the bias & fairness evaluations directly in promptsource (https://github.com/bigscience-workshop/promptsource).
To limit variance and the risk of a version mismatch with the numbers reported in the card, I would recommend using the v0.1 or v0.2 version of the repo.

Victor

Thank you Victor!

VictorSanh changed discussion status to closed
