Removing LLM version clutter from the leaderboard?

#20

by zarglu - opened Feb 5

zarglu

Feb 5

The top ten currently has 4 versions of GPT-4, 3 versions of Claude, and 2 versions of Gemini.

When every new LLM clearly outclassed its predecessors this was no issue, but with the incremental improvements lately, having so many versions listed makes the leaderboard feel cluttered and less useful.

Maybe have a main leaderboard where only the top-performing entry of each major version is listed?
i.e., only one entry for GPT-4.x (the best one), only one entry for Claude 2.x, only one entry for Claude 1.x, etc.
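
A rough sketch of that filtering rule, in case it helps the discussion. Everything here is an assumption for illustration: the `family_key` regex (model name plus major version number), the helper names, and the scores, which are placeholders and not real leaderboard ratings.

```python
import re

def family_key(model_name: str) -> str:
    """Hypothetical grouping rule: '<name>-<major version>'.

    Names without a leading '<letters>-<digits>' pattern form their own group.
    """
    m = re.match(r"([a-z]+)-(\d+)", model_name.lower())
    return f"{m.group(1)}-{m.group(2)}" if m else model_name.lower()

def best_per_family(entries):
    """Keep only the highest-scoring entry in each family, best first."""
    best = {}
    for name, score in entries:
        key = family_key(name)
        if key not in best or score > best[key][1]:
            best[key] = (name, score)
    return sorted(best.values(), key=lambda e: e[1], reverse=True)

# Placeholder scores, NOT actual leaderboard values.
entries = [
    ("gpt-4-0314", 90),
    ("gpt-4-0613", 85),
    ("claude-2.1", 80),
    ("claude-2.0", 82),
    ("claude-1", 84),
    ("gemini-pro", 70),
]

main_board = best_per_family(entries)
for name, score in main_board:
    print(name, score)
```

The full table could still be kept as a secondary view; this only decides what shows up on the main one.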

endolith

Feb 9

> every new LLM clearly outclassed its predecessors

They didn't, though. GPT-4-0314 was better than GPT-4-0613.

laduopan

Mar 28

I found that the GPT-3.5-turbo-0301 version was mistakenly labeled as GPT-3.5-turbo-0314, which is easy to confuse with the GPT-4-0314 version.
