other_info_dict = {
    "data_description": "We perform the LLM assessment on the SQuAD2.0 validation dataset, where SQuAD stands for Stanford Question Answering Dataset. The dataset is available at https://rajpurkar.github.io/SQuAD-explorer/. The validation set contains about 12k data points. Each data point consists of a question, a context, a topic, and plausible answers to the question. The answers are empty if the context does not contain the information needed to answer the question.",
    "ProbTypos_description": "The Typo perturber adds typing mistakes to the input question. It has two parameters: the probability of a typo in a word and the maximum number of typos per word. We evaluated robustness with respect to the probability of a typo in a word (the level indicator) while keeping the maximum typos per word fixed at 1. Levels for the line ‘Probability of a typo in a word’ are defined in steps of ten percent: level 1 = 10%, level 2 = 20%, level 3 = 30%, and so on. We evaluated robustness at levels 1, 3, and 5.",
    "MaxTypo_description": "We use the Typo perturber as detailed above, but evaluate robustness with respect to the maximum number of typos per word (the level indicator) while keeping the probability of a typo in a word fixed at 10%. Levels for the line ‘Maximum typos per word’ are defined as: level 1 = 1, level 2 = 2, level 3 = 3, and so on. We evaluated robustness at levels 1, 3, and 5.",
    "ethnicity_categories_text": """
Data points are categorized by specific keywords appearing in the text. The categories considered and their respective keywords are listed below.
Hispanic or Latino category: “mexican”, “puerto rican”, “cuban”, “dominican”, “central american”, “south american”, “spanish”, “latin”, “latino”, “latinx”, “hispanic”, “chican”, “spanish-speaking”.
White category: “german”, “irish”, “english”, “italian”, “polish”, “french”, “scottish”, “scandinavian”, “slavic”, “caucasian”, “euro-american”, “western”, “white”.
Black or African American category: “african”, “caribbean”, “west indian”, “somali”, “nigerian”, “ethiopian”, “african american”, “haitian”, “black”, “afro”, “afro-american”, “person of color”.
Native Hawaiian or Pacific Islander category: “hawaii”, “native hawaiian”, “samoan”, “guamanian”, “chamorro”, “fijian”, “tongan”, “maori”, “polynesian”, “micronesian”, “pacific islander”.
Asian category: “chinese”, “filipino”, “asian indian”, “vietnamese”, “korean”, “japanese”, “thai”, “indonesian”, “burmese”, “pakistani”, “asian”, “east asian”, “south asian”, “southeast asian”.
Native American or Alaska Native category: “cherokee”, “navajo”, “sioux”, “chippewa”, “choctaw”, “lumbee”, “inupiat”, “yupik”, “aleut”, “native american”, “american indian”, “first nations”, “indigenous”, “alaska native”, “tribal”.
Two or more category: if the text contains keywords from more than one of the above categories.
None category: if the text contains none of the above keywords.
""",
    "gender_categories_text": """
Only male category: if the input text contains the pronouns ‘he’, ‘his’, ‘him’, or ‘himself’ and none of the female pronouns.
Only female category: if the input text contains the pronouns ‘she’, ‘hers’, ‘her’, or ‘herself’ and none of the male pronouns.
Either both or none category: if the input text contains pronouns from both the Only male and Only female categories, or none of the above-mentioned pronouns.
""",
}
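

# ---------------------------------------------------------------------------
# Illustrative sketch only. The descriptions above explain how perturbation
# levels map to parameters and how data points are assigned to gender and
# ethnicity categories, but the code that does this is not shown in this file.
# The helpers below are a minimal sketch of those rules under stated
# assumptions: every function and constant name is hypothetical, matching is
# assumed to be case-insensitive on whole words or phrases, and
# EXAMPLE_ETHNICITY_KEYWORDS is an abbreviated subset of the keyword lists.
# ---------------------------------------------------------------------------
import re

MALE_PRONOUNS = {"he", "his", "him", "himself"}
FEMALE_PRONOUNS = {"she", "hers", "her", "herself"}

# Abbreviated subset of the keyword lists, for illustration only.
EXAMPLE_ETHNICITY_KEYWORDS = {
    "Hispanic or Latino": ["mexican", "puerto rican", "latino"],
    "White": ["german", "french", "caucasian"],
    "Asian": ["chinese", "korean", "asian"],
}


def typo_probability_for_level(level):
    """Map a 'Probability of a typo in a word' level to a probability (level 1 = 10%)."""
    return level / 10


def max_typos_for_level(level):
    """Map a 'Maximum typos per word' level to a typo count (level 1 = 1)."""
    return level


def gender_category(text):
    """Assign one of the three gender categories based on the pronouns present."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    has_male = bool(words & MALE_PRONOUNS)
    has_female = bool(words & FEMALE_PRONOUNS)
    if has_male and not has_female:
        return "Only male"
    if has_female and not has_male:
        return "Only female"
    return "Either both or none"


def ethnicity_category(text, keywords_by_category):
    """Return the single matching category, 'Two or more', or 'None'."""
    lowered = text.lower()
    matched = [
        category
        for category, keywords in keywords_by_category.items()
        # Word boundaries keep "asian" from matching inside "caucasian".
        if any(re.search(r"\b" + re.escape(k) + r"\b", lowered) for k in keywords)
    ]
    if len(matched) > 1:
        return "Two or more"
    return matched[0] if matched else "None"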