Spaces: CoreyMorris/MMLU-by-task-Leaderboard
4 contributors
History: 148 commits
Latest commit: Corey Morris, "copied main streamlit application to one that will specifically investigate moral reasoning" (298ba1f, over 1 year ago)
| File | Size | Last commit message | Updated |
| --- | --- | --- | --- |
| .github | - | added a test and removed the code to only test a specific file because that code did not work | over 1 year ago |
| .gitattributes | 1.52 kB | initial commit | over 1 year ago |
| .gitignore | 63 Bytes | updated gitignore | over 1 year ago |
| .gitmodules | 106 Bytes | added hugging face evaluation harness results submodule | over 1 year ago |
| README.md | 248 Bytes | initial commit | over 1 year ago |
| app.py | 16 kB | updated date and model count | over 1 year ago |
| contaminated_models.csv | 117 Bytes | Updated contaminated models | over 1 year ago |
| contaminated_models.txt | 65 Bytes | Updated contaminated models | over 1 year ago |
| details_data_processor.py | 4.04 kB | updated pipeline and init | over 1 year ago |
| dev_requirements.txt | 175 Bytes | Updated dependencies | over 1 year ago |
| moral_app.py | 14.8 kB | copied main streamlit application to one that will specifically investigate moral reasoning | over 1 year ago |
| moral_scenarios_questions.csv | 370 kB | Show a random question from the moral scenarios evaluation | over 1 year ago |
| requirements.txt | 156 Bytes | Updated dependencies | over 1 year ago |
| result_data_processor.py | 6.19 kB | Returning just a single file per model directory. Manually removing gpt-j-6b for now because there is something that is causing problems with processing the data | over 1 year ago |
| save_for_regression.py | 1.86 kB | changed to save and load in a directory | over 1 year ago |
| test_details_data_processing.py | 4.33 kB | added a test | over 1 year ago |
| test_integration.py | 1.96 kB | fixed test_streamlit_app_runs | over 1 year ago |
| test_paths.py | 780 Bytes | added a test and removed the code to only test a specific file because that code did not work | over 1 year ago |
| test_regression.py | 1.26 kB | added todo for test | over 1 year ago |
| test_result_data_processing.py | 1.66 kB | Added organization to dataframe | over 1 year ago |