Spaces: CoreyMorris/MMLU-by-task-Leaderboard
main
MMLU-by-task-Leaderboard
4 contributors
History: 185 commits
CoreyMorris | updated with new data | e05c716 | about 1 month ago
Name | Size | Last commit message | Last updated
.github | - | added a test and removed the code to only test a specific file because that code did not work | 9 months ago
.gitattributes | 1.52 kB | initial commit | 10 months ago
.gitignore | 68 Bytes | updated gitignore | 9 months ago
.gitmodules | 106 Bytes | added hugging face evaluation harness results submodule | 10 months ago
README.md | 202 Bytes | updated readme and requirements | 7 months ago
app.py | 15.6 kB | updated with new data | about 1 month ago
contaminated_models.csv | 117 Bytes | Updated contaminated models | 9 months ago
contaminated_models.txt | 65 Bytes | Updated contaminated models | 9 months ago
details_data_processor.py | 4.04 kB | updated pipeline and init | 9 months ago
dev_requirements.txt | 252 Bytes | updated dev requirements | 9 months ago
generate_csv.ipynb | 25.8 kB | update | 6 months ago
moral_app.py | 11.1 kB | Extracted plotting functions from moral_app to plotting_utils to improve organization and testability | 9 months ago
moral_scenarios_questions.csv | 370 kB | Show a random question from the moral scenarios evaluation | 9 months ago
plotting_utils.py | 4.42 kB | Extracted plotting functions from moral_app to plotting_utils to improve organization and testability | 9 months ago
processed_data_2023-09-29.csv | 1.35 MB | Updated data and added notes about the site. | about 1 month ago
processed_data_2023-10-05.csv | 1.35 MB | update | 8 months ago
processed_data_2023-10-06.csv | 1.62 MB | Added clickable links (#1) | 7 months ago
processed_data_2023-10-08.csv | 1.58 MB | added new result data | 7 months ago
processed_data_2023-11-18.csv | 1.18 MB | updated dashboard with new data | 6 months ago
processed_data_2023-11-21.csv | 1.25 MB | Updated with new results 11-21 | 6 months ago
processed_data_2024-04-16.csv | 5.74 MB | updated with new data | about 1 month ago
requirements.txt | 160 Bytes | updated readme and requirements | 7 months ago
result_data.csv | 1.35 MB | updated | 8 months ago
result_data_processor.py | 8.29 kB | Added clickable links (#1) | 7 months ago
save_for_regression.py | 1.86 kB | changed to save and load in a directory | 9 months ago
split_question.py | 964 Bytes | added code to split moral scenario question from one question to two | 9 months ago
test_details_data_processing.py | 4.33 kB | added a test | 9 months ago
test_integration.py | 1.96 kB | fixed test_streamlit_app_runs | 9 months ago
test_paths.py | 780 Bytes | added a test and removed the code to only test a specific file because that code did not work | 9 months ago
test_regression.py | 1.26 kB | added todo for test | 9 months ago
test_result_data_processing.py | 1.66 kB | Added organization to dataframe | 9 months ago