
Same model appears multiple times on the leaderboard with different scores

#180

by felixz - opened Aug 10, 2023

Discussion

felixz

Aug 10, 2023

•

edited Aug 10, 2023

Same model name and precision, but a different hash. Does it make sense to show the older, lower-scoring evaluations for the same model name and precision?

Yeah, I understand; I've seen the FAQ now.
Still, from a user's point of view it's not ideal. Maybe filter out the extras by default, but let people see everything if they check a box.

clefourrier

Open LLM Leaderboard org Aug 10, 2023

Hi @felixz ! Thank you for your issue!
Sadly, I can't know whether a given hash is the correct one for a model, or which one is :/ I see several options that would be quite easy to implement:

displaying the model with the best results
displaying the model with the latest hash

I'll think about the best way to do this in the next few days.
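For readers who want to see what the two options amount to, here is a minimal sketch in Python. It is not the leaderboard's actual code: the `Row` fields, the `dedupe` function, and the `strategy` names are all hypothetical. It also assumes a submission date is available for each row, since a commit hash alone carries no ordering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """One leaderboard row (hypothetical schema, for illustration only)."""
    model: str       # e.g. "org/model-a"
    precision: str   # e.g. "float16"
    commit: str      # commit hash of the evaluated revision
    submitted: str   # ISO 8601 date, so string comparison orders by time
    average: float   # average benchmark score


def dedupe(rows, strategy="best"):
    """Keep one row per (model, precision) pair.

    strategy="best"   keeps the row with the highest average score.
    strategy="latest" keeps the row with the most recent submission date.
    """
    if strategy == "best":
        rank = lambda r: r.average
    elif strategy == "latest":
        rank = lambda r: r.submitted
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")

    kept = {}
    for row in rows:
        key = (row.model, row.precision)
        # Replace the stored row only if the new one ranks strictly higher.
        if key not in kept or rank(row) > rank(kept[key]):
            kept[key] = row
    return list(kept.values())
```

Both strategies group by model name and precision, so rows with different precisions are never merged. The "best" strategy can hide a regression in a newer commit, while "latest" can hide an older commit that scored higher.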

felixz

Aug 17, 2023

@clefourrier
Something else to consider.
I noticed that if someone submits the same model multiple times under different commit IDs, it gets evaluated multiple times, as designed. But if the only thing that changed between the commits is the README file, with no changes to the weight files, the evaluation results are, as expected, identical down to two decimal places. It probably does not make sense to show multiple rows in that case. If the commits did alter the weight files, then yes, it makes sense to show both results, but a column with the commit date/time would also be helpful in that case.
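A rough sketch of this idea in Python, under stated assumptions: rows are plain dictionaries with hypothetical keys `model`, `precision`, `commit`, `submitted` (an ISO 8601 date), and `scores` (a list of benchmark scores). Rows whose scores match to two decimal places are collapsed into one, keeping the most recent commit. Matching scores are only a heuristic for "the weights did not change"; comparing the weight files themselves would be the reliable check.

```python
def collapse_identical(rows):
    """Collapse rows of the same model and precision whose scores are
    identical to two decimal places, keeping the most recent commit.

    The dictionary keys used here are hypothetical, not the leaderboard's
    real schema.
    """
    kept = {}
    # Process oldest first so that a later commit overwrites an earlier one.
    for row in sorted(rows, key=lambda r: r["submitted"]):
        key = (
            row["model"],
            row["precision"],
            tuple(round(score, 2) for score in row["scores"]),
        )
        kept[key] = row
    return list(kept.values())
```

Commits that produce different scores stay as separate rows, and each surviving row still carries its `submitted` date, which could feed the commit date/time column suggested above.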

Wubbbi

Aug 25, 2023

@clefourrier Any news on this? This also seems to be an issue for the "Open-Orca/OpenOrca-Platypus2-13B" model in the leaderboard.

clefourrier

Open LLM Leaderboard org Aug 28, 2023

@Wubbbi At the moment, since it's not blocking, I'm leaving it as is, since people can look for the commit closest to the model version that interests them.

clefourrier

Open LLM Leaderboard org Sep 8, 2023

Hi!
We should now display only the latest submitted version of a model for each precision. (We keep different precision levels separate, as this has been a highly requested feature.)

clefourrier changed discussion status to closed Sep 8, 2023

clefourrier

Open LLM Leaderboard org Sep 8, 2023

Feel free to tell us if you observe problems!
