Spaces: HuggingFaceH4 / open_llm_leaderboard

Need more model diversity

#64

by spaceman7777 - opened Jun 12, 2023

Discussion

spaceman7777

Jun 12, 2023

So, I've been waiting for benchmarks on other models, such as RWKV raven 14b and the small collection of other high-performing non-llama models.

The leaderboard is unfortunately 95% llama-based, so whenever there are non-llama models to benchmark, it would be best to give them a higher testing priority.

natolambert

Jun 15, 2023

Yeah we're targeting this for the next batch of human / gpt4 evals.
As for which models are on the first tab, it's primarily driven by what users submit.

clefourrier changed discussion status to closed Jun 16, 2023

spaceman7777

Jun 26, 2023 • edited Jun 26, 2023

Is there an issue with running RWKV raven 14b? It seems to have been in the running state for something like three weeks now, and there still aren't any results for any rwkv variants. I'd guess that there must be some kind of configuration issue? I assume things are semi-paused though because of the blog post?

Anyway, just wanted to flag that the rwkv models are most likely stuck.

(There still isn't an rwkv based model on the leaderboard)

spaceman7777 changed discussion status to open Jun 26, 2023

clefourrier

Hugging Face H4 org Jul 13, 2023

Hi @spaceman7777! We released a very big update of the LLM leaderboard today, and we'll focus on going through the backlog of models (some have been stuck for quite a bit).

Thank you for your patience :)

clefourrier changed discussion status to closed Jul 24, 2023
