Add column "Added on" or "Last benchmarked" with date?
Since existing models needed to be re-benchmarked following the MMLU blog post, the model queue has grown very large, and it makes the board seem old and stale.
Could a timestamp column be added that shows when a model was added to the board, when it was last benchmarked, or both?
Then we could sort by the "added on" column and see the new entries to the board, even if they're not near the top. And a "last benchmarked" column would show what models have been re-benchmarked following the MMLU post.
We are going to update all models with the new version of MMLU in one go, and we'll add the model hash too. But it's a good idea to add an "evaluated on" column, I'll see what we can do!
Hi!
The leaderboard has been updated: all models displayed now have the correct MMLU score, and we added a column with the model hash!
You can now also tell when a model was evaluated by looking at the date its results were committed to our evaluation dataset here.