Is the leaderboard 5-shot or 0-shot?

#2
by kristoph - opened

I noticed the code seems to be running a 5-shot test, so I wanted to clarify.

TIGER-Lab org

It's supposed to be 5-shot. However, some commercial models reported 0-shot results instead of 5-shot, so the leaderboard can be seen as max(0-shot, 5-shot).

wenhu changed discussion status to closed

Thank you! I am going to run these 0-shot on a variety of models, which presumably means somewhat reduced results. My goal is to test whether different prompting strategies have any impact on the result, but since they tend to change the output significantly, it doesn't make sense to run 5-shot.

(Sorry to reopen the discussion here.)
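
For anyone comparing the two settings, here is a minimal sketch of how a k-shot prompt differs from a 0-shot one. It is not the leaderboard's actual evaluation code: the `build_prompt` function, the field names (`question`, `options`, `answer`), and the prompt layout are all hypothetical and only illustrate that 0-shot is the same prompt with no worked examples in front of it.

```python
def build_prompt(question, options, examples, k=5):
    """Build a k-shot multiple-choice prompt; k=0 yields a 0-shot prompt.

    Hypothetical layout for illustration only, not the benchmark's format.
    """
    def fmt(q, opts, answer=None):
        # One block per question: the question, lettered options, answer slot.
        lines = [f"Question: {q}"]
        lines += [f"{chr(ord('A') + i)}. {o}" for i, o in enumerate(opts)]
        # Solved examples carry their answer; the target question leaves it open.
        lines.append(f"Answer: {answer}" if answer is not None else "Answer:")
        return "\n".join(lines)

    # Take the first k solved examples as demonstrations (none when k=0).
    shots = [fmt(e["question"], e["options"], e["answer"]) for e in examples[:k]]
    # The unanswered target question always comes last.
    return "\n\n".join(shots + [fmt(question, options)])
```

Calling it with `k=0` and `k=5` on the same question gives the two prompts to compare. A custom prompting strategy would replace the `fmt` layout, which is why mixing it with fixed 5-shot demonstrations gets awkward.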
