Any CSV format for the leaderboard?

#4
by zhiminy - opened

I want to download the leaderboard in CSV format...

I don't know of any way to get a clean CSV from the Gradio dataframe.
However, this leaderboard follows the same format as HuggingFaceH4/open_llm_leaderboard.
So the community tools, including https://github.com/Weyaxi/scrape-open-llm-leaderboard, should in principle work here; you just need to change the references so they point to this leaderboard.

An easier way to get the leaderboard results is to process the .json files in the eduagarcia-temp/llm_pt_leaderboard_requests repository; you just need to keep the requests whose 'status' is 'FINISHED'.

Example code:

from glob import glob
import json

import pandas as pd
from huggingface_hub import snapshot_download

# Download every request file from the dataset repository
snapshot_download(repo_id='eduagarcia-temp/llm_pt_leaderboard_requests', local_dir='./eval-queue', repo_type="dataset")

finished_evals = []
for p in glob("./eval-queue/**/*.json", recursive=True):
    with open(p, 'r', encoding='utf-8') as f:
        data = json.load(f)
    # Keep only evaluations that have finished
    if data.get('status') == 'FINISHED':
        # Promote each metric to a top-level column; tolerate a missing 'result_metrics'
        data.update(data.pop('result_metrics', None) or {})
        finished_evals.append(data)

# One row per finished evaluation
df = pd.DataFrame(finished_evals)
df.to_csv('leaderboard.csv', index=False)

For more detailed metrics, look inside these repositories: eduagarcia-temp/llm_pt_leaderboard_results and eduagarcia-temp/llm_pt_leaderboard_raw_results.
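
The file layout of those two repositories isn't described in this thread, so the following is only a rough sketch. It assumes they contain .json files with (possibly nested) objects and that they are dataset repositories like the requests one. The helper names (flatten, json_files_to_csv, export_detailed_results), the dotted column names, the extra source_file column, and the output paths are all made up for illustration.

```python
import csv
import json
from glob import glob


def flatten(record, prefix=""):
    """Flatten nested dicts into a single level, joining keys with '.'."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def json_files_to_csv(pattern, csv_path):
    """Write one CSV row per JSON object file matched by `pattern`."""
    rows = []
    for path in sorted(glob(pattern, recursive=True)):
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)
        if isinstance(data, dict):  # skip files whose top level is not an object
            row = flatten(data)
            row["source_file"] = path  # hypothetical extra column for traceability
            rows.append(row)
    # Use the union of all keys, since files may not share the same fields
    fieldnames = sorted({key for row in rows for key in row})
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


def export_detailed_results(repo_id, local_dir, csv_path):
    """Download a repository and flatten its .json files into a CSV."""
    from huggingface_hub import snapshot_download

    # repo_type="dataset" is an assumption carried over from the requests repository
    snapshot_download(repo_id=repo_id, local_dir=local_dir, repo_type="dataset")
    return json_files_to_csv(f"{local_dir}/**/*.json", csv_path)
```

For example, export_detailed_results('eduagarcia-temp/llm_pt_leaderboard_results', './eval-results', 'detailed_results.csv') would write one row per result file, provided the assumptions above hold. Missing fields are left empty, and list values are written as their string representation.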

eduagarcia changed discussion status to closed
