Open LLM Leaderboard org
No description provided.
Open LLM Leaderboard org
edited Apr 12

Changes in app.py:

  • Most importantly, I split init_space() into three functions for clarity; we now have a download_dataset() function with logging. I may improve this function later.
  • Got rid of unnecessary copy creation with copy().
  • init_space() now returns only leaderboard_df, raw_data, eval_queue_dfs.
  • plot_df is now created only when it's needed, via the new load_and_create_plots() function.
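A minimal sketch of that restructuring (function names follow the comment above; the bodies and dataframes are placeholders, not the actual leaderboard code):

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def download_dataset(repo_id: str, local_dir: str) -> str:
    # Download one dataset snapshot, with logging (placeholder body).
    logger.info("Downloading %s into %s", repo_id, local_dir)
    return local_dir

def init_space():
    # Startup builds only what is needed immediately; plot data is
    # deliberately not created here.
    leaderboard_df = None   # built from the downloaded results
    raw_data = None         # raw eval results, kept for the plots tab
    eval_queue_dfs = None   # pending/running/finished queues
    return leaderboard_df, raw_data, eval_queue_dfs

def load_and_create_plots(raw_data):
    # plot_df is derived lazily, only when the plots tab asks for it.
    plot_df = raw_data      # placeholder transformation
    return plot_df
```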

Changes in populate.py:

  • commented out the prints :D
Open LLM Leaderboard org
edited Apr 12

Changes in collections.py:

  • split up the update_collections() function into distinct parts for filtering data, adding models to the collection, and cleaning up the collection – this should improve readability and maintainability.
  • the conversion and filtering of the params_column are now done only once per model type and size, rather than multiple times.
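A sketch of the "convert once" idea (the "params" field name and dict-based schema are illustrative, not the leaderboard's real column):

```python
def bucket_models_by_size(rows, buckets):
    # Convert the params column to float exactly once, then reuse the
    # converted values for every (model type, size) bucket instead of
    # re-converting inside each filter.
    converted = [(row, float(row["params"])) for row in rows]
    return {
        name: [row for row, p in converted if lo <= p < hi]
        for name, (lo, hi) in buckets.items()
    }
```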

Changes in populate.py:

  • deleted prints :DD
Open LLM Leaderboard org

Changes in populate.py:

  • introduced two helper functions _load_json_data and _process_model_data.
  • enhanced readability
clefourrier changed pull request status to open
Open LLM Leaderboard org

@Wauplin the CI does not seem to open the sub spaces - do we need to change something in the code?

Open LLM Leaderboard org

Following the creation of this PR, an ephemeral Space HuggingFaceH4/open_llm_leaderboard-ci-pr-671 has been started. Any changes pushed to this PR will be synced with the test Space.
Since this PR has been created by a trusted author, the ephemeral Space has been configured with the correct hardware, storage, and secrets.
(This is an automated message.)

Open LLM Leaderboard org

Following new commits that happened in this PR, the ephemeral Space HuggingFaceH4/open_llm_leaderboard-ci-pr-671 has been updated.
(This is an automated message.)

Open LLM Leaderboard org

Nice changes so far!

  • you'll need to remove the ## UPDATED comments
  • # Data processing for plots now only on demand in the respective Gradio tab > very nice
  • get_evaluation_queue_df > I think you could manage the save logic in one step, by first creating the full entries list (storing both top-level files and subfolder files), then applying the processing.
  • a function such as _load_json_data could go in an util.py file
  • I think the following keys might be incorrect (not your fault - your code highlights it well) - cc @SaylorTwift
        "PENDING": ["PENDING", "RERUN"],
        "RUNNING": ["RUNNING"],
        "FINISHED": ["FINISHED", "PENDING_NEW_EVAL"],
    }
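The "entries list first, processing second" suggestion could be sketched like this (a one-level os.listdir walk under stated assumptions about the queue layout — not the actual populate.py code):

```python
import os

def gather_entries(save_path):
    # Build the full list of request files in one step: top-level
    # .json files plus .json files one folder down. A single loop can
    # then apply the same processing to every entry.
    entries = []
    for name in sorted(os.listdir(save_path)):
        full = os.path.join(save_path, name)
        if name.endswith(".json"):
            entries.append(full)
        elif os.path.isdir(full):
            entries.extend(
                os.path.join(full, sub)
                for sub in sorted(os.listdir(full))
                if sub.endswith(".json")
            )
    return entries
```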
Open LLM Leaderboard org

@Wauplin the CI does not seem to open the sub spaces - do we need to change something in the code?

@clefourrier Looks like it now worked. Have you changed something? 🤔

Open LLM Leaderboard org
edited Apr 15

Have you changed something? 🤔

I tagged you XD

Open LLM Leaderboard org

Clearly this is magic and the CI recognizing their true master XD (could be a matter of publishing the branch too though ^^)

Open LLM Leaderboard org

could be a matter of publishing the branch too though ^^

Good catch yes! Only PRs with status "open" are deployed (see here, here and here). Will update to get it right!

Clearly this is magic

I don't want it to be magiiiiic ! 😭

Open LLM Leaderboard org

@alozowski for the CI to work, you'll need to add a small check after CACHE_PATH = os.getenv("HF_HOME", ".") in envs.py to verify it's a path we have write access to, and fall back to "." otherwise
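A minimal sketch of that check, assuming a simple writability test is sufficient:

```python
import os

# Fall back to "." when HF_HOME points to a directory this process
# cannot write to (e.g. the CI Space without attached storage).
CACHE_PATH = os.getenv("HF_HOME", ".")
if not os.access(CACHE_PATH, os.W_OK):
    print(f"No write access to {CACHE_PATH}, falling back to '.'")
    CACHE_PATH = "."
```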

Open LLM Leaderboard org
edited Apr 15

Found a bug in my new app.py, the "private or deleted" button doesn't work – fixing

Open LLM Leaderboard org
edited Apr 16

Changes in app.py:

  • the bug was on line 97 with update_collections – I didn't return both original_df and leaderboard_df.

Changes in envs.py:

  • added a check after CACHE_PATH.

Changes in populate.py:

  • get_evaluation_queue_df now handles save_path with pathlib.

Changes in utils.py:

  • now it contains load_json_data function.
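The pathlib handling of save_path could look roughly like this (a sketch, not the merged code):

```python
from pathlib import Path

def iter_request_files(save_path):
    # pathlib handles save_path uniformly; rglob("*.json") matches
    # request files at any depth, replacing manual per-level checks.
    return sorted(Path(save_path).rglob("*.json"))
```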

I've finished the changes I wanted to make here and I'm ready to merge – please take a look, @clefourrier. Btw, I can apply the Makefile style before merging.

Open LLM Leaderboard org
edited Apr 16

Hi @alozowski :)

General comments

  • Looks good to merge once next items are fixed!
  • Not sure the check after CACHE_PATH is completely working, as the CI space is still broken for path reasons.
  • How did you test the collection update?

Specific code nits

  • app.py, line 80: the break should probably be removed, as the Space restart will exit execution anyway
  • src/populate.py, line 33: nice idea to use rglob! Why not use '*.json' directly as the pattern then? It would let you remove all the checks
  • src/tools/collections.py, line 65: the [:10] is redundant, since you already define the number above
Open LLM Leaderboard org
edited Apr 16

Debugging this CACHE_PATH problem – there may be lots of commits

Open LLM Leaderboard org

No problem!

Open LLM Leaderboard org
edited Apr 16

Why does the CI pass the HF_HOME env variable when it has no write permission to it? Changing env variables at runtime doesn't seem like a good thing, and I won't be able to fix it myself 😱

Is it possible to make CI pass the correct parameters? @clefourrier

Open LLM Leaderboard org

The CI does not have the same hardware as the front end of the leaderboard - when you have attached permanent storage (like the leaderboard), it's mounted at "/data" - but here the CI space does not have this, hence needs to store things at "."

Open LLM Leaderboard org

@Wauplin can we use different env vars for the CI?

Open LLM Leaderboard org

In the meantime, I changed the other parts:

  • If update_collections is broken or applied incorrectly, you won't be able to filter the models in the leaderboard table – but that's only my manual testing so far, plus checking with prints before and after. I'm open to any ideas on how to create better tests for such changes
  • fixed app.py line 80
  • you're right @clefourrier , the "*.json" pattern in src/populate.py works well without other checks
  • fixed line 65 in src/tools/collections.py

p.s. the name of the last commit should've been "small fixes", a typo :DD

Open LLM Leaderboard org

update_collection should not be related to the filtering in the table - if it fails, it should log it but not affect the rest of the space
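A sketch of that "log but don't break" pattern (the update_collections body here is a hypothetical stand-in that always fails, just to show the wrapper):

```python
import logging

logger = logging.getLogger(__name__)

def update_collections(df):
    # Hypothetical placeholder for the real collection update.
    raise RuntimeError("hub unavailable")

def safe_update_collections(df):
    # Best-effort: log the failure and keep the Space running, so a
    # broken collection update cannot affect table filtering.
    try:
        update_collections(df)
    except Exception:
        logger.exception("Collection update failed; continuing without it")
```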

Open LLM Leaderboard org

Otherwise, looks good to me! Merging!

clefourrier changed pull request status to merged
Open LLM Leaderboard org

PR is now merged/closed. The ephemeral Space has been deleted.
(This is an automated message.)

Open LLM Leaderboard org

As I investigated, the update_collection issue goes way 'deeper'... I'll explain later /mysteriously leaves/ :D

Thanks for the merge! 🤩
