Spaces: CoreyMorris/MMLU-by-task-Leaderboard



4 contributors

History: 164 commits

Corey Morris

stripping the whitespace from the input so that the filtering works with or without whitespace

e1345be about 1 year ago

.github
added a test and removed the code to only test a specific file because that code did not work over 1 year ago
.gitattributes

1.52 kB

initial commit over 1 year ago
.gitignore

68 Bytes

updated gitignore about 1 year ago
.gitmodules

106 Bytes

added hugging face evaluation harness results submodule over 1 year ago
README.md

248 Bytes

initial commit over 1 year ago
app.py

15.8 kB

stripping the whitespace from the input so that the filtering works with or without whitespace about 1 year ago
contaminated_models.csv

117 Bytes

Updated contaminated models over 1 year ago
contaminated_models.txt

65 Bytes

Updated contaminated models over 1 year ago
details_data_processor.py

4.04 kB

updated pipeline and init over 1 year ago
dev_requirements.txt

252 Bytes

updated dev requirements about 1 year ago
moral_app.py

11.1 kB

Extracted plotting functions from moral_app to plotting_utils to improve organization and testability about 1 year ago
moral_scenarios_questions.csv

370 kB

Show a random question from the moral scenarios evaluation over 1 year ago
plotting_utils.py

4.42 kB

Extracted plotting functions from moral_app to plotting_utils to improve organization and testability about 1 year ago
requirements.txt

156 Bytes

Updated dependencies over 1 year ago
result_data.csv

1.35 MB

updated about 1 year ago
result_data_processor.py

6.77 kB

WIP. Loading data from csv about 1 year ago
save_for_regression.py

1.86 kB

changed to save and load in a directory over 1 year ago
split_question.py

964 Bytes

added code to split moral scenario question from one question to two about 1 year ago
test_details_data_processing.py

4.33 kB

added a test over 1 year ago
test_integration.py

1.96 kB

fixed test_streamlit_app_runs over 1 year ago
test_paths.py

780 Bytes

added a test and removed the code to only test a specific file because that code did not work over 1 year ago
test_regression.py

1.26 kB

added todo for test over 1 year ago
test_result_data_processing.py

1.66 kB

Added organization to dataframe over 1 year ago