Spaces: HuggingFaceH4/open_llm_leaderboard


Inclusion of non open LLMs for straightforward comparison

by Supreeth - opened May 11, 2023

Discussion

Supreeth

May 11, 2023

Although it is orthogonal to the objective of this space, including results for closed models from Google, OpenAI, Anthropic, Cohere, etc. on the same benchmarks would help users find open-source LLMs that are close enough to the closed LLMs for their particular use case. It would greatly reduce the time spent on experimentation.

Thank You!

TerraNull

May 24, 2023

Seconding this, because it's easier to judge something when I already have a reference point for it. ChatGPT 3.5 and ChatGPT 4 are anchors that a lot of people are likely to know.

clefourrier

Hugging Face H4 org Jul 13, 2023

Hi! We won't do this, as this is a leaderboard for open models, both for philosophical reasons (openness is cool) and for practical reasons: we want to ensure that the results we display are accurate and reproducible, but 1) commercial closed models can change their API, rendering any score obtained at a given time incorrect, and 2) we re-run everything on our cluster to ensure all models are run on the same setup, and you can't do that for these models.

clefourrier changed discussion status to closed Jul 13, 2023
