phi-3-small-128k MATH Lvl 5 is 0
#897 by huu-ontocord - opened
This looks wrong. Is it something to do with how the evals are formatted?
Hi @huu-ontocord ,
MATH Lvl 5 is a challenging benchmark for many models, and I suppose this may be the case for microsoft/Phi-3-small-128k-instruct as well. To double-check the results, I re-ran the evaluation with the latest commit, and the numbers remained the same.
You can investigate the results in the Details if you want to. This discussion may also be helpful: I described there how to access a details dataset.
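Once the per-sample details are loaded, a quick tally can help tell a true 0 from a formatting problem. The snippet below is only a minimal sketch: the `exact_match` field name and the row layout are assumptions for illustration, not the confirmed schema of the details dataset, so adjust them to whatever the loaded rows actually contain.

```python
from typing import Any, Dict, Iterable, Mapping


def summarize_scores(
    rows: Iterable[Mapping[str, Any]],
    score_key: str = "exact_match",  # hypothetical field name, adjust to the real schema
) -> Dict[str, float]:
    """Count how many per-sample rows have a positive score."""
    materialized = list(rows)
    total = len(materialized)
    # A row counts as correct when its score is strictly greater than zero.
    correct = sum(1 for row in materialized if float(row.get(score_key, 0)) > 0)
    accuracy = correct / total if total else 0.0
    return {"total": total, "correct": correct, "accuracy": accuracy}


# Rows could come from the Hugging Face `datasets` library, for example:
#   from datasets import load_dataset
#   rows = load_dataset("<details-repo-id>", "<config-name>", split="<split>")
# The repo id, config name, and split are placeholders to be taken from the
# Details link; they are deliberately not filled in here.
if __name__ == "__main__":
    demo_rows = [{"exact_match": 0}, {"exact_match": 0}, {"exact_match": 1}]
    print(summarize_scores(demo_rows))
```

If every row scores 0, reading a few raw model outputs next to their reference answers is usually the fastest way to see whether the answers are wrong or simply not in the format the metric expects.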
Feel free to open a new discussion in case of any questions!
alozowski changed discussion status to closed