Creating functions for plotting results over time

#295
by chriscanal - opened

I've added two graphs and human baselines for each metric. I think this should help us track progress over time more easily.
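For readers following along, here is a rough sketch of what "tracking progress over time" against a human baseline could look like in pandas. It is not the code from this PR: the column names, the sample scores, and the `HUMAN_BASELINE` value are all made-up placeholders, and the real leaderboard data is shaped differently.

```python
import pandas as pd

def best_score_over_time(df, date_col, metric_col):
    """Return the running best score for one metric, ordered by date."""
    ordered = df.sort_values(date_col)  # sort_values returns a new frame
    out = ordered[[date_col]].copy()
    out["best_so_far"] = ordered[metric_col].cummax()
    return out.reset_index(drop=True)

# Hypothetical results, deliberately not in chronological order.
results = pd.DataFrame({
    "model": ["model-b", "model-d", "model-a", "model-c"],
    "date": pd.to_datetime(["2023-06-15", "2023-09-01", "2023-05-01", "2023-07-20"]),
    "score": [55.0, 62.0, 40.0, 50.0],
})

HUMAN_BASELINE = 90.0  # placeholder value, not a real baseline

timeline = best_score_over_time(results, "date", "score")
timeline["gap_to_baseline"] = HUMAN_BASELINE - timeline["best_so_far"]
print(timeline)
```

The running maximum is what makes the curve read as "progress": a later model that scores lower than the current best does not pull the line down.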

Open LLM Leaderboard with Graphs.png

Once this is approved there is a minor change to app.py that I will open a PR for in order to get these graphs to actually display

Open LLM Leaderboard org

Hi, thanks for this contribution, it looks great!! I think, however, that it would be better to have it displayed in another tab (like the About section). What do you think?

Sure, how's this?
image.png

Any other changes I should make?

Open LLM Leaderboard org

Hi ! I just looked at the changes locally. Looks great :)
Is there, however, a specific reason why you chose non-interactive plots instead of plotly?

@SaylorTwift I didn't use plotly because I'm dumb and ignorant. lmao. I tried it out, and it looks soooo much better!

Screenshot 2023-09-26 at 9.55.57 AM.png

Great idea!

Open LLM Leaderboard org

Oh my god, it looks so good!! I will try it out locally asap. I think I spotted a reordering of the models in the leaderboard when using your PR, can you maybe check that?

Open LLM Leaderboard org

Yes, that's what I thought, the models are reordered. I do not really have time to look into it right now but tell me when the issue is solved !

Screenshot 2023-09-27 at 14.34.24.png

Oops, yeah, I messed that up. I was reordering by upload date to figure out the timeline of the scores. I now make a copy of the original df so the leaderboard order doesn't get modified.
image.png
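A minimal sketch of the copy-before-sort fix described above, assuming the timeline is built by sorting on an upload-date column. The function name, the `upload_date` column, and the sample rows are hypothetical and not taken from the PR.

```python
import pandas as pd

def scores_by_upload_date(leaderboard_df, date_col="upload_date"):
    """Return a date-ordered copy, leaving the caller's frame untouched."""
    timeline_df = leaderboard_df.copy()  # work on a copy, not the original
    timeline_df.sort_values(date_col, inplace=True)  # only the copy is reordered
    return timeline_df

# Hypothetical leaderboard, already ranked by score.
leaderboard = pd.DataFrame({
    "model": ["m1", "m2", "m3"],
    "score": [70.0, 65.0, 60.0],
    "upload_date": pd.to_datetime(["2023-08-01", "2023-06-01", "2023-07-01"]),
})

timeline = scores_by_upload_date(leaderboard)
print(timeline["model"].tolist())     # date order
print(leaderboard["model"].tolist())  # original ranking order is preserved
```

The reordering bug in this kind of code usually comes from sorting the shared frame in place, so the table that is displayed later inherits the date order.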

Anything else I can do to get this merged?

Open LLM Leaderboard org

Hi!
This looks super neat, thank you for your work!

I have a small nit: the tab name is unclear atm, so it would be good to rename it to something else, maybe "Metrics evolution through time" for example.
Do you also display the scores of flagged models? If yes I don't think they should be included in the graph.

But congrats, it looks very cool, looking forward to having it merged!
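One way the flagged-model exclusion could be done before plotting, as a sketch only: it assumes the data carries a boolean `flagged` column, which is a hypothetical name, and the rows below are invented for illustration.

```python
import pandas as pd

def drop_flagged(df, flagged_col="flagged"):
    """Return only the rows whose flag column is False."""
    keep = ~df[flagged_col].astype(bool)
    return df[keep].reset_index(drop=True)

# Hypothetical scores with one flagged entry.
scores = pd.DataFrame({
    "model": ["m1", "m2", "m3"],
    "score": [70.0, 65.0, 60.0],
    "flagged": [False, True, False],
})

plot_df = drop_flagged(scores)
print(plot_df["model"].tolist())
```

Filtering before the timeline is computed, rather than hiding points afterwards, keeps a flagged score from ever setting the "best so far" line.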

I updated based on your request @clefourrier. Unfortunately, I have no way of fixing the conflicts. I'm using the Hugging Face web UI to write the code, and I can't run it locally because my local setup no longer has permission to run the code, due to the meta-llama/Llama-2-70b-chat-hf permissions. Looking at what's currently in main, though, I think the conflicts would be very minor and take max 30 seconds to fix.

Open LLM Leaderboard org

I merged main to your branch to fix the issues :) (info on how to do it in the cmd line is here for a possible next time).

Thank you very much for your work!!!

clefourrier changed pull request status to merged
Open LLM Leaderboard org

Hi @chriscanal
FYI we deactivated your cool tab for now, because we are updating the front to make it more maintainable, but we'll add it back as soon as the front end is upgraded (ETA next week most likely) 🤗
