Our submitted model is still not visible on the leaderboard

#129
by Limerobot - opened

Hello.
I can't see the results for upstage/llama-65b-instruct, which we submitted before Llama-2 came out. Our most recent submission, upstage/Llama-2-70b-instruct-1024, submitted yesterday, does not appear in the pending list either.

@clefourrier, can you tell me what's going on with our model?

@augtoma Thanks for the information and the links.

@clefourrier

We've verified that our config, model, and tokenizer all load properly, and the model passes the tests in our local lm-evaluation-harness. We set the pretrained argument to the repo ID, for example pretrained="upstage/llama-65b-instruct".
[Attached image: fig1.png]

So, it's challenging for us to determine the cause of the failure. Could you provide any information about the errors that may have resulted in our submissions failing?
(Currently our transformers version is 4.31.0)
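
For anyone hitting a similar failure, the load check described above could be sketched roughly as follows. This is only an illustration, not the leaderboard's actual pipeline: run_load_checks and transformers_loaders are hypothetical helper names, and the sketch checks only the config and tokenizer, since loading the full 65B weights is a separate and much heavier step.

```python
from typing import Callable, Dict


def run_load_checks(
    repo_id: str, loaders: Dict[str, Callable[[str], object]]
) -> Dict[str, str]:
    """Call each loader on repo_id and record "ok" or the error it raised.

    Hypothetical helper: it only reports whether each artifact loads locally.
    """
    results: Dict[str, str] = {}
    for name, load in loaders.items():
        try:
            load(repo_id)
            results[name] = "ok"
        except Exception as exc:  # report any failure instead of stopping
            results[name] = f"failed: {type(exc).__name__}: {exc}"
    return results


def transformers_loaders() -> Dict[str, Callable[[str], object]]:
    """Loaders backed by transformers (imported lazily; needs network access)."""
    from transformers import AutoConfig, AutoTokenizer

    return {
        "config": AutoConfig.from_pretrained,
        "tokenizer": AutoTokenizer.from_pretrained,
    }


# Example call (assumes transformers is installed and the repo is reachable):
# print(run_load_checks("upstage/llama-65b-instruct", transformers_loaders()))
```

A passing check like this only shows that the files load locally. It does not reveal why a leaderboard job failed, which is why the error details from the evaluation side are still needed.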

Open LLM Leaderboard org

@Limerobot we're investigating, thank you for reporting this

Hello @clefourrier. We have the same issue with MayaPH/GodziLLa-30B-instruct. I saw that you manually added it back a few hours ago at https://huggingface.co/datasets/open-llm-leaderboard/requests/tree/main/MayaPH, but the latest job also failed. I have just modified the config.json. Could you please re-run it? Thank you.

@clefourrier, any update on this? I really appreciate your help.

Open LLM Leaderboard org

@jaspercatapang @Limerobot are all your models now on the leaderboard?

@clefourrier Thanks for asking. https://huggingface.co/upstage/Llama-2-70b-instruct-v2 is still in the queue.

Also, upstage/Llama-2-70b-instruct-1024 and upstage/Llama-2-70b-instruct are duplicates. Feel free to remove upstage/Llama-2-70b-instruct-1024 from the leaderboard.

Open LLM Leaderboard org

If it's in the queue, it will be evaluated soon. As for the duplicate models, I will remove upstage/Llama-2-70b-instruct-1024.
Thanks for your patience!

SaylorTwift changed discussion status to closed

@clefourrier @SaylorTwift Thank you for your hard work. :)
