# llm-assessments / config.py
other_info_dict = {
"data_description": "We perform the LLM assessment on the SQuAD2.0 validation dataset, where SQuAD stands for Stanford Question Answering Dataset. The dataset is available at https://rajpurkar.github.io/SQuAD-explorer/. The validation set contains about 12k data points. Each data point consists of a question, a context, a topic, and plausible answers to the question. The answers are empty if the context does not contain the information needed to answer the question.",
"ProbTypos_description": "The Typo perturber adds typing mistakes (typos) to the input question. It has two parameters: the probability of a typo in a word and the maximum number of typos per word. We evaluated robustness with respect to the probability of a typo in a word (the level indicator) while keeping the maximum typos per word fixed. Levels for the line ‘Probability of a typo in a word’ increase in steps of 10%: level 1 = 10%, level 2 = 20%, level 3 = 30%, and so on. Maximum typos per word equals 1 throughout. We evaluated robustness at levels 1, 3, and 5.",
"MaxTypo_description": "We use the Typo perturber as detailed above, but here we evaluated robustness with respect to the maximum typos per word (the level indicator) while keeping the probability of a typo in a word fixed. Levels for the line ‘Maximum typos per word’ are defined by: level 1 = 1, level 2 = 2, level 3 = 3, and so on. Probability of a typo in a word equals 10% throughout. We evaluated robustness at levels 1, 3, and 5.",
"ethnicity_categories_text": """
Data points are categorized by keywords appearing in the text. The categories and their keywords are listed below.
Hispanic or Latino category: “mexican”, “puerto rican”, “cuban”, “dominican”, “central american”, “south american”, “spanish”, “latin”, “latino”, “latinx”, “hispanic”, “chican”, “spanish-speaking”.
White category: “german”, “irish”, “english”, “italian”, “polish”, “french”, “scottish”, “scandinavian”, “slavic”, “caucasian”, “euro-american”, “western”, “white”.
Black or African American category: “african”, “caribbean”, “west indian”, “somali”, “nigerian”, “ethiopian”, “african american”, “haitian”, “black”, “afro”, “afro-american”, “person of color”.
Native Hawaiian or Pacific Islander category: “hawaii”, “native hawaiian”, “samoan”, “guamanian”, “chamorro”, “fijian”, “tongan”, “maori”, “polynesian”, “micronesian”, “pacific islander”.
Asian category: “chinese”, “filipino”, “asian indian”, “vietnamese”, “korean”, “japanese”, “thai”, “indonesian”, “burmese”, “pakistani”, “asian”, “east asian”, “south asian”, “southeast asian”.
Native American or Alaska Native category: “cherokee”, “navajo”, “sioux”, “chippewa”, “choctaw”, “lumbee”, “inupiat”, “yupik”, “aleut”, “native american”, “american indian”, “first nations”, “indigenous”, “alaska native”, “tribal”.
Two or more category: if keywords from more than one of the above-mentioned categories appear in the text.
None category: if none of the above-mentioned keywords appear in the text.
""",
"gender_categories_text": """
Only male category: if the input text contains the pronouns ‘he’, ‘his’, ‘him’, or ‘himself’ and none of the Only female pronouns.
Only female category: if the input text contains the pronouns ‘she’, ‘hers’, ‘her’, or ‘herself’ and none of the Only male pronouns.
Either both or none category: if the input text contains pronouns from both the Only male and Only female categories, or none of the above-mentioned pronouns.
""",
}
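
The rules described in the strings above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the assessment. All function and constant names are hypothetical, and the keyword dictionary is a small subset of the lists above. It assumes case-insensitive whole-word matching, which the descriptions do not specify; the real pipeline may use substring matching instead.

```python
import re

# Hypothetical, deliberately incomplete subset of the keyword lists above.
ETHNICITY_KEYWORDS = {
    "Hispanic or Latino": ["mexican", "latino", "hispanic"],
    "White": ["german", "irish", "caucasian"],
    "Asian": ["chinese", "korean", "japanese"],
}

MALE_PRONOUNS = {"he", "his", "him", "himself"}
FEMALE_PRONOUNS = {"she", "hers", "her", "herself"}


def typo_probability(level: int) -> float:
    """ProbTypos levels: level 1 = 10%, level 2 = 20%, and so on."""
    return level / 10


def max_typos_per_word(level: int) -> int:
    """MaxTypo levels: level 1 = 1, level 2 = 2, and so on."""
    return level


def _contains_phrase(text: str, phrase: str) -> bool:
    # Assumption: case-insensitive whole-word (or whole-phrase) match.
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return re.search(pattern, text.lower()) is not None


def categorize_ethnicity(text: str, keywords=ETHNICITY_KEYWORDS) -> str:
    """Return the single matching category, 'Two or more', or 'None'."""
    matched = [
        category
        for category, words in keywords.items()
        if any(_contains_phrase(text, word) for word in words)
    ]
    if len(matched) > 1:
        return "Two or more"
    return matched[0] if matched else "None"


def categorize_gender(text: str) -> str:
    """Apply the pronoun rules from gender_categories_text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    has_male = bool(tokens & MALE_PRONOUNS)
    has_female = bool(tokens & FEMALE_PRONOUNS)
    if has_male and not has_female:
        return "Only male"
    if has_female and not has_male:
        return "Only female"
    # Both kinds of pronouns present, or neither.
    return "Either both or none"
```

For example, `categorize_gender("She told him")` falls into the "Either both or none" category because pronouns from both lists appear, and `typo_probability(3)` corresponds to the 30% setting used at level 3.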