open-llm-leaderboard/open_llm_leaderboard

phi-3-small-128k MATH Lvl 5 is 0

#897

by huu-ontocord - opened Aug 24, 2024

Discussion

huu-ontocord

Aug 24, 2024

This looks wrong. Is it something to do with how the evals are formatted?

alozowski

Open LLM Leaderboard org Aug 28, 2024

Hi @huu-ontocord ,

MATH Lvl 5 is a challenging benchmark for many models, and I suppose this may be the case with microsoft/Phi-3-small-128k-instruct as well. To double-check the results, I re-ran the evaluation with the latest commit, and the numbers remained the same.

You can investigate the results here, in the Details, if you want to. This discussion may also be helpful; there I described how to access a details dataset.
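For anyone going through the details by hand, here is a minimal sketch of how a formatting problem could be told apart from genuinely wrong answers. It assumes the per-sample records have already been exported as plain dicts with hypothetical `target` and `response` fields, and it assumes a strict exact-match check on the last `\boxed{...}` answer. Both are illustrative assumptions; the actual details schema and the leaderboard's scoring may differ.

```python
from typing import Dict, List, Optional


def last_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Closing brace of \boxed{...} reached.
                return "".join(chars).strip()
        chars.append(ch)
    return None  # braces never balanced


def diagnose(samples: List[Dict[str, str]]) -> Dict[str, int]:
    """Count samples by outcome under the assumed exact-match rule.

    `target` and `response` are hypothetical field names.
    """
    counts = {"match": 0, "mismatch": 0, "no_boxed_answer": 0}
    for sample in samples:
        predicted = last_boxed(sample["response"])
        if predicted is None:
            counts["no_boxed_answer"] += 1
        elif predicted == sample["target"].strip():
            counts["match"] += 1
        else:
            counts["mismatch"] += 1
    return counts


# Made-up records for illustration only, not real leaderboard output.
samples = [
    {"target": "\\frac{1}{2}", "response": "So the answer is \\boxed{\\frac{1}{2}}."},
    {"target": "4", "response": "The answer is 4."},
    {"target": "7", "response": "Thus \\boxed{8}."},
]
result = diagnose(samples)
print(result)
```

If most samples land in `no_boxed_answer`, that would point to an answer-format issue; if they land in `mismatch`, the model is more likely just getting the problems wrong.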

Feel free to open a new discussion in case of any questions!

alozowski changed discussion status to closed Aug 28, 2024
