Please submit to HF Leaderboard

#2
by eNavarro - opened

I've recently read a comparison of LLMs, and this one (the Q2_K version) sits right at the top, next to GPT-4.
Comparison is here: https://www.reddit.com/r/LocalLLaMA/comments/17vcr9d/llm_comparisontest_2x_34b_yi_dolphin_nous/

Impressive as that may sound, it's still a home-made test crafted with a particular purpose in mind, not an all-round benchmark. I'd love to see this model compared to others on the HuggingFace Leaderboard: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard

Thanks!

I submitted the model for evaluation a few days ago. I checked back just now and it wasn't in the pending section, and there was no report of a failure either. Did they remove it?

Well, I added it again.

It's just too good

Update: It's still not appearing in the pending, evaluating, or finished categories. When I try to re-submit, it says it's already submitted. No clue what's happening.

The elites don't want regular people to know about the infinite POWER of frankenmerges!

Hi!
For this kind of problem, it would be best to open an issue in the leaderboard's discussions next time, so we can give you a hand :)
We have an FAQ: if a model stops appearing in the queues, it usually means its evaluation failed.

I checked your specific model, and that's indeed what happened (a connection problem when downloading the weights; it happens from time to time).
It's been relaunched; we'll see if it fits in memory. If it does, you'll get results in a couple of days.

Is it solved? I still can't find goliath-120b on the leaderboard. Really looking forward to seeing its results.

@clefourrier @HuggingFaceH4 WHERE ARE THE RESULTS?

Hi @silver!

No, this model does not fit in the memory of our GPUs; we'll need to adapt our backend to launch multi-node evaluations automatically before we can run it.
The best option would be for you to open an issue on the Open LLM Leaderboard page directly, so we can track this and make sure it stays in our backlog.

Btw, as indicated in the FAQ, you can follow the status of your models by looking at their request file.
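
If it helps, here's a minimal sketch of how you could check a model's request file programmatically with `huggingface_hub`. The requests dataset name (`open-llm-leaderboard/requests`), the `<org>/<model>_eval_request_*.json` file layout, and the placeholder model id are assumptions about how the leaderboard stores submissions; double-check them against the FAQ before relying on this.

```python
# Minimal sketch (assumptions: the leaderboard keeps submissions in the
# "open-llm-leaderboard/requests" dataset as <org>/<model>_eval_request_*.json
# files with a "status" field; adjust names if the FAQ says otherwise).
import json

from huggingface_hub import hf_hub_download, list_repo_files

REQUESTS_REPO = "open-llm-leaderboard/requests"   # assumed dataset name
MODEL_ID = "your-org/goliath-120b"                # hypothetical: use the exact id you submitted

# Look for request files belonging to this model in the requests dataset.
all_files = list_repo_files(REQUESTS_REPO, repo_type="dataset")
matches = [f for f in all_files if f.startswith(MODEL_ID) and f.endswith(".json")]

for path in matches:
    local_path = hf_hub_download(REQUESTS_REPO, filename=path, repo_type="dataset")
    with open(local_path) as fp:
        request = json.load(fp)
    # The request file records the submission's evaluation status (e.g. PENDING or FAILED).
    print(path, "->", request.get("status"))
```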
