What are "raw" metrics?
What is the difference between the "raw" scores and the scores used in the leaderboard?
And how can you convert from the lm-eval outputs to the scores in the leaderboard? Does the lm-eval-harness output the raw score?
Hi @aginart-salesforce ,
Please check this page in our documentation about score normalization. If anything remains unclear, you can ping me here and I'll try to explain :)
I understand the logic of score normalization, but is this normalization applied when doing local leaderboard model evaluation? When I run the evaluation locally, I only get the original score, not the normalized score. Thank you @alozowski
Hi! No, you need to compute it yourself (using the snippets in the doc) to get results when doing a local evaluation. We will soon provide scripts to reproduce the scores precisely.
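For reference, here is a minimal sketch of the kind of normalization the documentation describes (assuming the raw score is an accuracy in [0, 1] and you know the task's random-guessing baseline; `normalize_score` and the example baseline value are illustrative, not the official script):

```python
# Minimal sketch (not the official leaderboard script): rescale a raw lm-eval
# accuracy so the task's random-guessing baseline maps to 0 and a perfect
# score maps to 100. The baseline (e.g. 0.25 for a 4-way multiple-choice task)
# is an assumption here; look it up per task in the leaderboard documentation.

def normalize_score(raw_score: float, random_baseline: float) -> float:
    """Map raw_score in [0, 1] to a 0-100 scale anchored at the random baseline."""
    if raw_score <= random_baseline:  # at or below chance level -> floor at 0
        return 0.0
    return (raw_score - random_baseline) / (1.0 - random_baseline) * 100.0


# Example: a raw accuracy of 0.40 on a 4-choice task (assumed baseline 0.25)
print(normalize_score(0.40, random_baseline=0.25))  # -> 20.0
```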