Our models failed - Can we get the logs?
Hello HF Team!
We are having trouble getting our models evaluated on the open_llm_leaderboard.
We have submitted several 70B models, but none of them seem to get through the evaluation, apparently due to a network issue.
(see: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/675)
Is it the network issue again that is preventing our models from being evaluated, or something else?
The models concerned are:
https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/paloalma/ECE-TW3-JRGL-V1_eval_request_False_float16_Original.json
https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/paloalma/ECE-TW3-JRGL-V2_eval_request_False_bfloat16_Original.json
https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/paloalma/ECE-TW3-JRGL-V3_eval_request_False_bfloat16_Original.json
https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/paloalma/ECE-TW3-JRGL-V4_eval_request_False_bfloat16_Original.json
Thanks for your time ;)
Hi @paloalma!
Hmm, it seems like the problem remains the same as it was in the previous discussion. As we checked, the problem is on our side. I rescheduled all the models again and we will investigate this network problem in parallel.
I will close this issue now, as the problem should be fixed on our side. If the status is FAILED again, please reopen this issue :)
Hi,
@alozowski, it seems the models have the status FAILED again. Could you please check what is going on, as this is the third time now? That way we can adapt on our side.
Thanks,
André-Louis