dataset,prompt,metric,value
xcopa_id,C1 or C2? premise_idmt,accuracy,0.55
xcopa_id,best_option_idmt,accuracy,0.69
xcopa_id,cause_effect_idmt,accuracy,0.87
xcopa_id,i_am_hesitating_idmt,accuracy,0.75
xcopa_id,plausible_alternatives_idmt,accuracy,0.79
xcopa_id,median,accuracy,0.75
xcopa_sw,C1 or C2? premise_swmt,accuracy,0.53
xcopa_sw,best_option_swmt,accuracy,0.58
xcopa_sw,cause_effect_swmt,accuracy,0.6
xcopa_sw,i_am_hesitating_swmt,accuracy,0.61
xcopa_sw,plausible_alternatives_swmt,accuracy,0.58
xcopa_sw,median,accuracy,0.58
xcopa_ta,C1 or C2? premise_tamt,accuracy,0.62
xcopa_ta,best_option_tamt,accuracy,0.56
xcopa_ta,cause_effect_tamt,accuracy,0.63
xcopa_ta,i_am_hesitating_tamt,accuracy,0.69
xcopa_ta,plausible_alternatives_tamt,accuracy,0.67
xcopa_ta,median,accuracy,0.63
xcopa_vi,C1 or C2? premise_vimt,accuracy,0.62
xcopa_vi,best_option_vimt,accuracy,0.76
xcopa_vi,cause_effect_vimt,accuracy,0.77
xcopa_vi,i_am_hesitating_vimt,accuracy,0.81
xcopa_vi,plausible_alternatives_vimt,accuracy,0.76
xcopa_vi,median,accuracy,0.76
xcopa_zh,C1 or C2? premise_zhmt,accuracy,0.7
xcopa_zh,best_option_zhmt,accuracy,0.71
xcopa_zh,cause_effect_zhmt,accuracy,0.82
xcopa_zh,i_am_hesitating_zhmt,accuracy,0.82
xcopa_zh,plausible_alternatives_zhmt,accuracy,0.83
xcopa_zh,median,accuracy,0.82
xstory_cloze_ar,Answer Given options_armt,accuracy,0.9219060225016545
xstory_cloze_ar,Choose Story Ending_armt,accuracy,0.9245532759761748
xstory_cloze_ar,Generate Ending_armt,accuracy,0.6730641958967571
xstory_cloze_ar,Novel Correct Ending_armt,accuracy,0.9179351422898743
xstory_cloze_ar,Story Continuation and Options_armt,accuracy,0.913302448709464
xstory_cloze_ar,median,accuracy,0.9179351422898743
xstory_cloze_es,Answer Given options_esmt,accuracy,0.9298477829252151
xstory_cloze_es,Choose Story Ending_esmt,accuracy,0.9444076770350761
xstory_cloze_es,Generate Ending_esmt,accuracy,0.7365982792852416
xstory_cloze_es,Novel Correct Ending_esmt,accuracy,0.928524156187955
xstory_cloze_es,Story Continuation and Options_esmt,accuracy,0.928524156187955
xstory_cloze_es,median,accuracy,0.928524156187955
xstory_cloze_eu,Answer Given options_eumt,accuracy,0.8405029781601588
xstory_cloze_eu,Choose Story Ending_eumt,accuracy,0.8424884182660489
xstory_cloze_eu,Generate Ending_eumt,accuracy,0.6578424884182661
xstory_cloze_eu,Novel Correct Ending_eumt,accuracy,0.8272667107875579
xstory_cloze_eu,Story Continuation and Options_eumt,accuracy,0.8014559894109861
xstory_cloze_eu,median,accuracy,0.8272667107875579
xstory_cloze_hi,Answer Given options_himt,accuracy,0.8623428193249504
xstory_cloze_hi,Choose Story Ending_himt,accuracy,0.8808735936465917
xstory_cloze_hi,Generate Ending_himt,accuracy,0.6591661151555261
xstory_cloze_hi,Novel Correct Ending_himt,accuracy,0.8590337524818001
xstory_cloze_hi,Story Continuation and Options_himt,accuracy,0.871608206485771
xstory_cloze_hi,median,accuracy,0.8623428193249504
xstory_cloze_id,Answer Given options_idmt,accuracy,0.913964262078094
xstory_cloze_id,Choose Story Ending_idmt,accuracy,0.9040370615486433
xstory_cloze_id,Generate Ending_idmt,accuracy,0.7028457974851092
xstory_cloze_id,Novel Correct Ending_idmt,accuracy,0.900727994705493
xstory_cloze_id,Story Continuation and Options_idmt,accuracy,0.8954334877564527
xstory_cloze_id,median,accuracy,0.900727994705493
xstory_cloze_zh,Answer Given options_zhmt,accuracy,0.9040370615486433
xstory_cloze_zh,Choose Story Ending_zhmt,accuracy,0.9192587690271343
xstory_cloze_zh,Generate Ending_zhmt,accuracy,0.686962276637988
xstory_cloze_zh,Novel Correct Ending_zhmt,accuracy,0.913302448709464
xstory_cloze_zh,Story Continuation and Options_zhmt,accuracy,0.9033752481800132
xstory_cloze_zh,median,accuracy,0.9040370615486433
xwinograd_fr,Replace_frmt,accuracy,0.5662650602409639
xwinograd_fr,True or False_frmt,accuracy,0.4819277108433735
xwinograd_fr,does underscore refer to_frmt,accuracy,0.5301204819277109
xwinograd_fr,stand for_frmt,accuracy,0.5060240963855421
xwinograd_fr,underscore refer to_frmt,accuracy,0.5662650602409639
xwinograd_fr,median,accuracy,0.5301204819277109
xwinograd_pt,Replace_ptmt,accuracy,0.5931558935361216
xwinograd_pt,True or False_ptmt,accuracy,0.5095057034220533
xwinograd_pt,does underscore refer to_ptmt,accuracy,0.5475285171102662
xwinograd_pt,stand for_ptmt,accuracy,0.4866920152091255
xwinograd_pt,underscore refer to_ptmt,accuracy,0.5475285171102662
xwinograd_pt,median,accuracy,0.5475285171102662
xwinograd_zh,Replace_zhmt,accuracy,0.5833333333333334
xwinograd_zh,True or False_zhmt,accuracy,0.4801587301587302
xwinograd_zh,does underscore refer to_zhmt,accuracy,0.5714285714285714
xwinograd_zh,stand for_zhmt,accuracy,0.5198412698412699
xwinograd_zh,underscore refer to_zhmt,accuracy,0.6071428571428571
xwinograd_zh,median,accuracy,0.5714285714285714
multiple,average,multiple,0.7521365325222159