phi-3-small-128k MATH Lvl 5 is 0

#897
by huu-ontocord - opened

This looks wrong. Is it something to do with how the evals are formatted?

Open LLM Leaderboard org

Hi @huu-ontocord,

MATH Lvl 5 is a challenging benchmark for many models, and I suppose this may be the case with microsoft/Phi-3-small-128k-instruct as well. To double-check the results, I re-ran the evaluation with the latest commit, and the numbers remained the same.

If you want to, you can investigate the results here, in the Details. This discussion may also be helpful: I described there how to access a details dataset.
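As a rough illustration of what accessing a details dataset can look like, here is a minimal sketch using the `datasets` library. The repo naming pattern (`open-llm-leaderboard/<org>__<model>-details`), the config naming pattern, the task name `leaderboard_math_algebra_hard`, and the `latest` split are all assumptions for illustration, not confirmed in this thread; please check the actual dataset page for the real names. The helper functions are hypothetical.

```python
def details_repo_id(model_id: str, org: str = "open-llm-leaderboard") -> str:
    """Build the ASSUMED details-dataset repo id for a model.

    Assumption: "/" in the model id is replaced by "__" and "-details" is appended.
    """
    return f"{org}/{model_id.replace('/', '__')}-details"


def details_config_name(model_id: str, task: str) -> str:
    """Build the ASSUMED config name: "<org>__<model>__<task>"."""
    return f"{model_id.replace('/', '__')}__{task}"


def load_details(model_id: str, task: str, split: str = "latest"):
    """Load per-sample details for one task.

    Needs `pip install datasets` and network access; the dataset may also
    require accepting access conditions on the Hub. The split name is an assumption.
    """
    from datasets import load_dataset  # imported lazily so the helpers work offline

    return load_dataset(
        details_repo_id(model_id),
        name=details_config_name(model_id, task),
        split=split,
    )


if __name__ == "__main__":
    model = "microsoft/Phi-3-small-128k-instruct"
    task = "leaderboard_math_algebra_hard"  # hypothetical task name
    print(details_repo_id(model))
    print(details_config_name(model, task))
    # details = load_details(model, task)  # uncomment to actually download
```

Looking at the raw model outputs next to the extracted answers in such a dataset is one way to check whether a score of 0 comes from answer formatting or from genuinely wrong answers.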

Feel free to open a new discussion in case of any questions!

alozowski changed discussion status to closed
