OpsEval / data_v2 /dfcdata_zh_mc_gen.csv
update 05-15
32e04fa
name,zero_naive,zero_self_con,zero_cot,zero_cot_self_con,few_naive,few_self_con,few_cot,few_cot_self_con
Baichuan2-13B-Chat,64.79,66.2,68.31,73.24,62.68,64.08,68.31,66.2
Chatglm3-6B,55.63380282,55.63380282,61.97183099,61.97183099,51.4084507,51.4084507,57.04225352,57.04225352
Devops-Model-14B-Chat,33.8,34.51,54.23,56.34,80.99,78.87,51.41,63.38
Ernie-Bot-4.0,81.0,81.0,82.0,82.0,83.0,83.0,85.0,85.0
Gpt-3.5-Turbo,77.46,76.06,82.39,81.69,71.13,73.24,80.28,78.87
GPT-4,85.21,85.21,86.62,86.62,82.39,82.39,90.14,90.14
Internlm2-Chat-20B,74.64788732,74.64788732,74.64788732,74.64788732,78.16901408,78.16901408,,
Internlm2-Chat-7B,76.05633803,76.05633803,73.94366197,73.94366197,74.64788732,74.64788732,57.04225352,57.04225352
Llama-2-13B,45.77,45.77,70.42,70.42,61.97,61.97,61.27,61.27
Llama-2-70B-Chat,14.79,14.79,67.61,67.61,41.55,40.85,72.54,72.54
Llama-2-7B,30.28,30.28,45.77,45.77,45.07,45.07,61.97,61.97
Mistral-7B,2.82,2.82,64.79,64.79,16.9,16.9,64.08,64.08
Qwen-14B-Chat,73.94,73.94,73.24,76.76,76.06,74.65,69.01,71.83
Qwen-72B-Chat,86.62,86.62,83.8,83.8,83.8,83.8,83.8,83.8
Yi-34B-Chat,78.87,80.28,85.92,86.62,86.62,86.62,76.06,85.21
gemma_2b,28.16901,28.16901,38.02817,38.02817,27.46479,27.46479,41.5493,41.5493
gemma_7b,35.91549,35.91549,59.15493,59.15493,50.70423,50.70423,66.90141,66.90141
Qwen1.5-14B-Base,73.23944,73.23944,76.05634,76.05634,81.69014,81.69014,57.04225,57.04225
Qwen1.5-14B-Chat,75.35211,76.05634,80.28169,83.09859,83.80282,80.98592,78.87324,80.98592