Spaces: CoreyMorris/MMLU-by-task-Leaderboard
refs/pr/2
4 contributors
History: 156 commits
imdatta0: Support for multi column filtering using comma separated values (135f2a9, 9 months ago)
Name | Size | Last commit message | Updated
.github | - | added a test and removed the code to only test a specific file because that code did not work | 10 months ago
.gitattributes | 1.52 kB | initial commit | 11 months ago
.gitignore | 68 Bytes | updated gitignore | 9 months ago
.gitmodules | 106 Bytes | added hugging face evaluation harness results submodule | 11 months ago
README.md | 248 Bytes | initial commit | 11 months ago
app.py | 16.1 kB | Support for multi column filtering using comma separated values | 9 months ago
contaminated_models.csv | 117 Bytes | Updated contaminated models | 10 months ago
contaminated_models.txt | 65 Bytes | Updated contaminated models | 10 months ago
details_data_processor.py | 4.04 kB | updated pipeline and init | 10 months ago
dev_requirements.txt | 252 Bytes | updated dev requirements | 9 months ago
moral_app.py | 11.1 kB | Extracted plotting functions from moral_app to plotting_utils to improve organization and testability | 9 months ago
moral_scenarios_questions.csv | 370 kB | Show a random question from the moral scenarios evaluation | 10 months ago
plotting_utils.py | 4.42 kB | Extracted plotting functions from moral_app to plotting_utils to improve organization and testability | 9 months ago
requirements.txt | 156 Bytes | Updated dependencies | 10 months ago
result_data_processor.py | 6.61 kB | Changed error logging from print statements to logger. It is not currently working to save to a file locally | 9 months ago
save_for_regression.py | 1.86 kB | changed to save and load in a directory | 10 months ago
split_question.py | 964 Bytes | added code to split moral scenario question from one question to two | 9 months ago
test_details_data_processing.py | 4.33 kB | added a test | 10 months ago
test_integration.py | 1.96 kB | fixed test_streamlit_app_runs | 10 months ago
test_paths.py | 780 Bytes | added a test and removed the code to only test a specific file because that code did not work | 10 months ago
test_regression.py | 1.26 kB | added todo for test | 10 months ago
test_result_data_processing.py | 1.66 kB | Added organization to dataframe | 10 months ago