Why are there no OpenAI models here? We need GPT-3.5 and GPT-4 to compare!
I definitely agree that they should put closed-source LLMs on the leaderboard. Just give them a unique background color to differentiate them. I understand that the official benchmark results for closed-source models might differ from what Hugging Face would get, but the difference shouldn't be that big, and it doesn't justify leaving them off the leaderboard.
Hi! We won't do this, as this is a leaderboard for open models, both for philosophical reasons (openness is cool) and for practical ones. We want to ensure that the results we display are accurate and reproducible, but 1) commercial closed models can change their API, rendering any score from a given point in time incorrect, and 2) we re-run everything on our cluster to ensure all models are run on the same setup, which you can't do for these models.