OpsEval / data_v2 /bosc_zh_mc_gen.csv
name,zero_naive,zero_self_con,zero_cot,zero_cot_self_con,few_naive,few_self_con,few_cot,few_cot_self_con
Baichuan2-13B-Chat,37.5,40.0,47.5,52.5,37.5,37.5,42.5,45.0
Chatglm3-6B,35.0,35.0,50.0,50.0,47.5,47.5,45.0,45.0
Devops-Model-14B-Chat,35.0,27.5,37.5,52.5,50.0,50.0,55.0,62.5
Ernie-Bot-4.0,57.5,57.5,60.0,60.0,52.5,52.5,57.5,57.5
Gpt-3.5-Turbo,50.0,47.5,55.0,55.0,40.0,40.0,50.0,55.0
GPT-4,57.5,57.5,57.5,57.5,52.5,52.5,62.5,62.5
Internlm2-Chat-20B,47.5,47.5,,,47.5,47.5,,
Internlm2-Chat-7B,60.0,60.0,57.5,57.5,55.0,55.0,62.5,62.5
Llama-2-13B,42.5,42.5,50.0,50.0,50.0,50.0,42.5,42.5
Llama-2-70B-Chat,0.0,0.0,57.5,57.5,25.0,25.0,45.0,45.0
Llama-2-7B,32.5,32.5,45.0,45.0,45.0,45.0,45.0,45.0
Mistral-7B,0.0,0.0,37.5,37.5,20.0,20.0,50.0,50.0
Qwen-14B-Chat,47.5,45.0,50.0,47.5,50.0,47.5,55.0,57.5
Qwen-72B-Chat,50.0,50.0,47.5,47.5,45.0,45.0,60.0,60.0
Yi-34B-Chat,55.0,55.0,60.0,67.5,50.0,50.0,52.5,55.0
gemma_2b,37.5,37.5,40.0,40.0,32.5,32.5,40.0,40.0
gemma_7b,32.5,32.5,62.5,62.5,40.0,40.0,50.0,50.0
Qwen1.5-14B-Base,47.5,52.85714285714286,50.0,47.14285714285714,47.5,52.85714285714286,45.0,30.0
Qwen1.5-14B-Chat,45.0,47.5,60.0,50.0,52.5,47.5,60.0,45.0
Qwen1.5-14B-Chat,,47.5,,72.5,,55.0,,60.0