Do Yi models have eval issues?

#393
by migtissera - opened

Hey! My Tess-Medium-200K-v1.0 model (renamed to Tess-M-Creative-v1.0) has been running for 2 days now. Is there an error? It hasn't failed; it's still in the "Running Evaluation" queue.

This is the model: https://huggingface.co/migtissera/Tess-M-Creative-v1.0

Open LLM Leaderboard org
edited Nov 21, 2023

Hi!
Could you please link to the request file? There are many reasons why this could have happened :)

Open LLM Leaderboard org

The specific model you linked was cancelled (because a more important job was launched on the cluster) and automatically requeued - we're apparently still having a small issue with our display.

Open LLM Leaderboard org

Hi! Your model actually failed because of a network error on our side. I will re-add it to the queue. Thanks for your patience :)

Thank you @SaylorTwift

Open LLM Leaderboard org
edited Nov 22, 2023

Hi @migtissera,
You'll find the request files for your models here. If you point to them next time, we'll be able to debug your problems faster, as they contain the job id which allows us to look at the logs :)
The first model failed because the launching system had a problem launching this big a model (we will need to launch it on multiple nodes).
The other two were cancelled for priority reasons. I'll let @SaylorTwift tell you if they were rescheduled (and relaunch them if not). The cluster is very full at the moment, though, so it might take some time before they are evaluated.

Thanks @clefourrier! I didn't know where to find this. Will do this from the next one onwards :)

Open LLM Leaderboard org

I'm going to close this issue for now; feel free to reopen it if needed.

clefourrier changed discussion status to closed
