Evaluation disappeared!

#646
by david-infinimol - opened

Hi,

I submitted an evaluation for https://huggingface.co/Infinimol/miiqu-f16 about a week ago.

It was still in the evaluation phase for about 6 days, and now it's gone, but it's not in the evaluation list. What happened? Should I resubmit?

Open LLM Leaderboard org

Hi!
It's likely that your evaluation failed! Please follow the steps in the FAQ so we can investigate.

Happy to, but I read the FAQ, and it's not apparent what needs to be done next. The same goes for the 'About' section (PS: there's a spelling mistake, "RESSOURCES" should be "resources").

The only information I found was this:
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/ab9ff3919d7cad250562569f35a479450d79f576
not super helpful.

My model is quite big (105B params), but I did see a larger model in the list of completed evaluations (https://huggingface.co/NeverSleep/MiquMaid-v2-2x70B-DPO?not-for-all-audiences=true).
This larger model used BF16, and is the only big model I could find. Is BF16 important?

Open LLM Leaderboard org

Hi!
Thanks for pointing to the request file, which is what we need to investigate. I checked the logs, and your model failed to download as the connection timed out - I passed it to pending again.

We don't guarantee that models this big will get evaluated properly though - we use a combination of data and pipeline parallelism to distribute big models but sometimes they just barely fit on one node of our cluster. bf16 vs f16 should not affect this that much.
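
As a rough illustration of why bf16 vs f16 should not change whether a model fits: both formats store each parameter in 16 bits, so the weights alone take the same space. The sketch below is only back-of-the-envelope arithmetic; the helper name `weight_footprint_gb` is made up for this example, and it ignores activations, caches, and any framework overhead.

```python
# Bytes per parameter for common weight dtypes.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_footprint_gb(n_params: float, dtype: str) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes).

    Hypothetical helper for illustration; it does not account for
    activations, KV caches, or per-device overhead.
    """
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


n_params = 105e9  # the 105B-parameter model discussed in this thread
for dtype in ("float16", "bfloat16"):
    print(f"{dtype}: {weight_footprint_gb(n_params, dtype):.0f} GB")
```

Either way the weights come to roughly 210 GB, so whether the model fits depends on the total memory of the node and how the model is split across devices, not on which 16-bit format is used.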

Closing the issue, feel free to reopen if there is another problem.

clefourrier changed discussion status to closed

Just a quick check please: is it running now or stalled in the download?

Last time it probably blocked a server for a week. Could someone check it please?

@dnhkng Just giving you a heads up that miiqu-f16 is on the leaderboard, but you have to unselect "contains a merge". Also, the other model you mentioned (miqu-1-70b-sf) can be shown by unselecting "private or deleted". Perhaps this was set by the uploader.
