open-llm-leaderboard/open_llm_leaderboard · MMLU by task leaderboard

Aug 8, 2023

I created a leaderboard showing the accuracy score for each task in MMLU: https://huggingface.co/spaces/CoreyMorris/MMLU-by-task-Leaderboard . I'll keep it updated at least until Hugging Face decides to create one with the breakdown by task. I'm open to suggestions for improving it.

clefourrier

Open LLM Leaderboard org Aug 8, 2023

This is a great idea! (We probably won't add one here at the moment)

Overall, I would suggest:

removing non-MMLU scores
adding some of the original MMLU groupings (humanities, social sciences, STEM, other) (you can find more info on the original repository)
using a bigger widget for the table (it's hard to search in it) and possibly adding a search function.
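
For the groupings suggestion, here is a minimal sketch of how per-task accuracies could be rolled up into the four original MMLU categories. It is not the leaderboard's actual code: the function name `group_accuracy` is hypothetical, the mapping covers only a handful of tasks for illustration (the original repository defines the full one), and it uses a plain unweighted mean over tasks, which is an assumption (weighting by question count is another option).

```python
from collections import defaultdict
from statistics import mean

# Partial, illustrative task -> group mapping; the full mapping lives in
# the original MMLU repository.
TASK_TO_GROUP = {
    "abstract_algebra": "STEM",
    "college_physics": "STEM",
    "philosophy": "humanities",
    "world_religions": "humanities",
    "econometrics": "social sciences",
    "sociology": "social sciences",
    "clinical_knowledge": "other",
    "marketing": "other",
}


def group_accuracy(task_scores):
    """Average per-task accuracy within each MMLU group (unweighted mean).

    task_scores: dict mapping task name -> accuracy in [0, 1].
    Tasks missing from TASK_TO_GROUP (e.g. non-MMLU benchmarks) are skipped.
    """
    by_group = defaultdict(list)
    for task, accuracy in task_scores.items():
        group = TASK_TO_GROUP.get(task)
        if group is not None:
            by_group[group].append(accuracy)
    return {group: mean(values) for group, values in by_group.items()}


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not real model results.
    example_scores = {
        "abstract_algebra": 0.25,
        "college_physics": 0.75,
        "philosophy": 0.5,
    }
    print(group_accuracy(example_scores))
```

Skipping unmapped tasks keeps non-MMLU scores from leaking into the group averages while leaving them available elsewhere in the table.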

I really like the plots, you could add some explanation of what you are plotting and why, it would really enrich your page.

Lastly, don't forget your own citation link! :)

CoreyMorris

Aug 8, 2023

Thanks for the suggestions!

CoreyMorris

Aug 10, 2023

@clefourrier

I made the table bigger and added some ways to filter (model size, model name, and task name).
I also added some explanation of the plots and my own citation.

I'll probably add the original MMLU groupings as well. I'm not sure about removing the non-MMLU scores. I want people to be able to compare those as well, but I should probably at least add some explanation and maybe have them hidden or less prominent by default.

Thanks again for the feedback!

clefourrier changed discussion status to closed Aug 16, 2023