dataset,prompt,metric,value
xcopa_zh,C1 or C2? premise_zhht,accuracy,0.55
xcopa_zh,best_option_zhht,accuracy,0.67
xcopa_zh,cause_effect_zhht,accuracy,0.79
xcopa_zh,i_am_hesitating_zhht,accuracy,0.77
xcopa_zh,plausible_alternatives_zhht,accuracy,0.75
xcopa_zh,median,accuracy,0.75
xstory_cloze_zh,Answer Given options_zhht,accuracy,0.7054930509596293
xstory_cloze_zh,Choose Story Ending_zhht,accuracy,0.7948378557246857
xstory_cloze_zh,Generate Ending_zhht,accuracy,0.6366644606221046
xstory_cloze_zh,Novel Correct Ending_zhht,accuracy,0.7782925215089345
xstory_cloze_zh,Story Continuation and Options_zhht,accuracy,0.771012574454004
xstory_cloze_zh,median,accuracy,0.771012574454004
xwinograd_zh,Replace_zhht,accuracy,0.5178571428571429
xwinograd_zh,True or False_zhht,accuracy,0.5218253968253969
xwinograd_zh,does underscore refer to_zhht,accuracy,0.4662698412698413
xwinograd_zh,stand for_zhht,accuracy,0.49404761904761907
xwinograd_zh,underscore refer to_zhht,accuracy,0.44047619047619047
xwinograd_zh,median,accuracy,0.49404761904761907
multiple,average,multiple,0.6716867311672077