LLM Evaluation not moving forward

#298
by SwatCat - opened

Hi,

Leaderboard evaluation has been stuck for a while. Is there any update on when it might be resumed?

Maybe there are just no pending models? But several recently released models aren't Llama-based, like the Baichuan series, InternLM 20B, and Qwen-14B.
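One way to check whether anything is actually queued is to inspect the submission requests directly. Below is a minimal sketch; the repo id (`open-llm-leaderboard/requests`) and the `status` field in each request file are assumptions based on how the leaderboard has historically tracked submissions, so the details may differ:

```python
# Sketch: count pending leaderboard submissions, assuming requests are stored
# as JSON files in a public dataset repo (assumed schema, may have changed).
import json

from huggingface_hub import HfApi, hf_hub_download

REQUESTS_REPO = "open-llm-leaderboard/requests"  # assumed repo id

api = HfApi()
json_files = [
    f for f in api.list_repo_files(REQUESTS_REPO, repo_type="dataset")
    if f.endswith(".json")
]

pending = 0
for path in json_files:
    local_path = hf_hub_download(REQUESTS_REPO, path, repo_type="dataset")
    with open(local_path) as fh:
        request = json.load(fh)
    # Each request file is assumed to carry a "status" field such as
    # PENDING / RUNNING / FINISHED.
    if request.get("status") == "PENDING":
        pending += 1

print(f"{pending} pending submissions out of {len(json_files)} requests")
```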

Since tech companies and research labs have recently been focusing more on multi-modality, this text-only benchmark is no longer as comprehensive an evaluation as it used to be.

Still, it's a good playground for advancing text datasets and fine-tuning techniques even further.

Open LLM Leaderboard org

Hi! We still have many models being submitted by the community. However, we are preparing an update to the evaluations used by the leaderboard, so we are taking a short pause to catch up with all the models we have already evaluated and to provide the leaderboard with even more precise benchmarks :)

@SaylorTwift What kind of update? Will the benchmark datasets be changed, or the weighting with which they contribute to the overall score?

Open LLM Leaderboard org

The current benchmarks will be left untouched, but we are experimenting with adding other benchmarks to give a better view of model performance.
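In the meantime, the existing benchmarks can be run locally with EleutherAI's lm-evaluation-harness, which the leaderboard is built on. A minimal sketch follows; the exact entry point and arguments vary across harness versions, and the model id here is just one of the models mentioned above:

```python
# Sketch: run one leaderboard-style benchmark locally with EleutherAI's
# lm-evaluation-harness (pip install lm-eval). API details vary by version;
# this follows the simple_evaluate entry point.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    # Qwen-14B as an example model; trust_remote_code is needed for its
    # custom modeling code.
    model_args="pretrained=Qwen/Qwen-14B,trust_remote_code=True",
    tasks=["hellaswag"],  # one of the leaderboard tasks
    num_fewshot=10,       # the leaderboard evaluates HellaSwag 10-shot
    batch_size=8,
)

print(results["results"]["hellaswag"])
```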

SaylorTwift changed discussion status to closed
