Top Performance Over Time
As we approach and exceed human-level performance on many of these benchmarks, it would be nice to track the top score on each benchmark over time. This data would look best on a graph with the y-axis showing a particular score (defaulting to the average) and the x-axis showing time. I think this would be helpful for tracking our progress as a community. Are others interested? If so, I might be able to find time to contribute and make this happen. I was thinking of something that looks like this (fake data):
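A minimal sketch of how the points for such a graph might be computed, assuming each leaderboard entry can be reduced to a (submission date, score) pair. The function name, the data layout, and the numbers are all hypothetical and made up for illustration; this is not how the leaderboard stores its results.

```python
from datetime import date


def top_score_over_time(submissions):
    """Return (date, best_score_so_far) points, one each time the record improves."""
    points = []
    best = float("-inf")
    # Walk the submissions in chronological order and keep a running maximum.
    for day, score in sorted(submissions):
        if score > best:
            best = score
            points.append((day, best))
    return points


# Fake (date, average score) pairs, for illustration only.
fake_submissions = [
    (date(2023, 5, 1), 55.0),
    (date(2023, 6, 10), 61.5),
    (date(2023, 6, 20), 58.2),  # below the current record, so it adds no point
    (date(2023, 7, 15), 67.3),
]

for day, best in top_score_over_time(fake_submissions):
    print(day.isoformat(), best)
```

The resulting points could be drawn as a step plot with time on the x-axis and the selected score on the y-axis, with one such series per benchmark.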
That sounds like a very cool idea. If you create a space for it, I'll link it in the Resources discussion :)
Hi @chriscanal,
I'll close this issue for now. Feel free to reopen it if you find the time to make a POC :)
I've opened a PR for this: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/295
This is very cool work, looking forward to either merging it or seeing it in its own space :)
Closing since this has been merged, thank you again for this work!