name,zero_naive,zero_self_con,zero_cot,zero_cot_self_con,few_naive,few_self_con,few_cot,few_cot_self_con
Baichuan2-13B-Chat,37.5,40.0,47.5,52.5,37.5,37.5,42.5,45.0
ChatGLM3-6B,35.0,35.0,50.0,50.0,47.5,47.5,45.0,45.0
DevOps-Model-14B-Chat,35.0,27.5,37.5,52.5,50.0,50.0,55.0,62.5
ERNIE-Bot-4.0,57.5,57.5,60.0,60.0,52.5,52.5,57.5,57.5
GPT-3.5-turbo,50.0,47.5,55.0,55.0,40.0,40.0,50.0,55.0
Gpt4,57.5,57.5,57.5,57.5,52.5,52.5,62.5,62.5
InternLM2-Chat-20B,47.5,47.5,,,47.5,47.5,,
InternLM2-Chat-7B,60.0,60.0,57.5,57.5,55.0,55.0,62.5,62.5
LLaMA-2-13B,42.5,42.5,50.0,50.0,50.0,50.0,42.5,42.5
LLaMA-2-70B-Chat,0.0,0.0,57.5,57.5,25.0,25.0,45.0,45.0
LLaMA-2-7B,32.5,32.5,45.0,45.0,45.0,45.0,45.0,45.0
Mistral-7B,0.0,0.0,37.5,37.5,20.0,20.0,50.0,50.0
Qwen-14B-Chat,47.5,45.0,50.0,47.5,50.0,47.5,55.0,57.5
Qwen-72B-Chat,50.0,50.0,47.5,47.5,45.0,45.0,60.0,60.0
Yi-34B-Chat,55.0,55.0,60.0,67.5,50.0,50.0,52.5,55.0
Claude-3-Opus,72.85714285714286,72.85714285714286,,,,,,
Gemma_2B,37.5,37.5,40.0,40.0,32.5,32.5,40.0,40.0
Gemma_7B,32.5,32.5,62.5,62.5,40.0,40.0,50.0,50.0
Meta-Llama-3-8B-Instruct,52.85714285714286,52.85714285714286,47.14285714285714,47.14285714285714,52.85714285714286,52.85714285714286,30.0,30.0
Qwen1.5-14B-Base,47.5,47.5,50.0,50.0,47.5,47.5,45.0,45.0
Qwen1.5-14B-Chat,45.0,47.5,60.0,72.5,52.5,55.0,60.0,60.0