Space: AIM-Harvard/rabbits-leaderboard

shanchen committed on Jun 17

Commit 59e18af • 1 Parent(s): 1ab4d20

Update app.py

Files changed (1)

app.py CHANGED

@@ -17,7 +17,7 @@ explanation_data = {
         "Adjusted Robustness Score"
     ],
     "Description": [
-        "A custom MC task where the model is asked to match a brand name to its generic counterpart and vice versa. This task is designed to test the model's ability to understand drug name synonyms.",
+        "A custom MC task where the model is asked to match a brand name to its generic counterpart and vice versa. This task is designed to test the model's ability to understand drug name synonyms. Gemini results are missing due to their safety filters",
         "G2B Refers to the 'Generic' to 'Brand' name swap. This is model accuracy on MedMCQA task where generic drug names are substituted with brand names.",
         "Model accuracy on MedMCQA task with original data. (Only includes questions that overlap with the g2b dataset)",
         "Difference in MedMCQA accuracy for swapped and non-swapped datasets, highlighting the impact of G2B drug name substitution on performance.",
@@ -55,7 +55,7 @@ df.rename(columns={
 }, inplace=True)
 # Sort DataFrame by DrugMatchQA descending
-df = df.sort_values(by='average_g2b', ascending=False)
+df = df.sort_values(by='Average G2B Accuracy', ascending=False)
 #Create adjusted robustness score that accounts for g2b accuracy and difference in accuracy
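
The change to the sort key appears to follow from the preceding `df.rename(..., inplace=True)` call: once a column has been renamed, `sort_values` has to refer to it by its new name. The sketch below illustrates that behavior under stated assumptions. The model names, scores, and the exact rename mapping are hypothetical, since the full mapping in app.py is not shown in this diff.

```python
import pandas as pd

# Hypothetical data; the real leaderboard values are not part of this diff.
df = pd.DataFrame({
    "model": ["model_a", "model_b", "model_c"],
    "average_g2b": [0.61, 0.74, 0.58],
})

# Assumed mapping: the raw column is renamed to its display name.
df.rename(columns={"average_g2b": "Average G2B Accuracy"}, inplace=True)

# Sorting by the old name now fails, because that column no longer exists.
try:
    df.sort_values(by="average_g2b", ascending=False)
    old_name_works = True
except KeyError:
    old_name_works = False

# Sorting by the renamed column succeeds and puts the best score first.
df = df.sort_values(by="Average G2B Accuracy", ascending=False)
top_model = df.iloc[0]["model"]
```

If the sort were moved above the rename instead, the original `'average_g2b'` key would still work; the commit keeps the existing order and updates the key.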