title: Mini Datathon
emoji: 💻
colorFrom: blue
colorTo: yellow
sdk: streamlit
sdk_version: 1.40.1
app_file: app.py
pinned: true
Mini Datathon
This datathon platform is fully developed in Python using Streamlit, with very few lines of code!
As the title suggests, it is designed for small datathons (but can easily scale), and the scripts are easy to understand.
Example
In the deployed version, we use the UCI SECOM dataset (imbalanced, binary classification), and submissions are evaluated by the PR-AUC score:
- in the config.py file, you would need to set the following parameters:
GREATER_IS_BETTER = True
SKLEARN_SCORER = average_precision_score
SKLEARN_ADDITIONAL_PARAMETERS = {'average': 'micro'}
- upload the relevant data to your Google Drive and share the links.
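As a rough illustration of how the three config.py parameters could fit together, here is a minimal sketch. It is not the platform's actual code: `score_submission`, `is_improvement`, and the `accuracy` stand-in scorer are hypothetical names introduced here so the example runs without scikit-learn. It only assumes that the scorer follows the scikit-learn convention `scorer(y_true, y_pred, **params)`.

```python
# Hypothetical stand-in scorer so the sketch runs without scikit-learn.
# In the deployed config this role is played by average_precision_score.
def accuracy(y_true, y_pred, **_ignored):
    hits = sum(1 for truth, pred in zip(y_true, y_pred) if truth == pred)
    return hits / len(y_true)


# Same three knobs as in config.py (values here are illustrative).
GREATER_IS_BETTER = True
SKLEARN_SCORER = accuracy
SKLEARN_ADDITIONAL_PARAMETERS = {"average": "micro"}


def score_submission(y_true, y_pred):
    """Apply the configured scorer with its extra keyword arguments."""
    return SKLEARN_SCORER(y_true, y_pred, **SKLEARN_ADDITIONAL_PARAMETERS)


def is_improvement(new_score, best_score):
    """Compare two scores in the direction given by GREATER_IS_BETTER."""
    if best_score is None:
        return True
    if GREATER_IS_BETTER:
        return new_score > best_score
    return new_score < best_score
```

For an error-type metric such as mean squared error, GREATER_IS_BETTER would be set to False so that lower scores count as improvements.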
Behind the scenes
Databases
The platform needs only 2 components to be saved:
The leaderboard
The leaderboard is a CSV file that is updated every time a user submits predictions. The CSV file contains 4 columns:
- id: the login of the team
- score: the best score of the team
- nb_submissions: the number of submissions the team has uploaded
- rank: the live rank of the team
There is only 1 row per team, since only each team's best score is saved.
By default, a benchmark score is pushed to the leaderboard:
id | score |
---|---|
benchmark | 0.6 |
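The update rule described above (one row per team, best score kept, submissions counted, rank recomputed) can be sketched as follows. This is an assumption-laden illustration, not the code from the leaderboard script: `update_leaderboard` is a hypothetical helper, rows are held as plain dictionaries rather than read from the CSV file, and the benchmark row is assumed to start with 0 submissions.

```python
def update_leaderboard(rows, team_id, new_score, greater_is_better=True):
    """Return a new leaderboard with one submission recorded and ranks recomputed."""
    # Index rows by team id; copy them so the input list is left untouched.
    board = {row["id"]: {"nb_submissions": 0, **row} for row in rows}

    # Create the team's single row on its first submission.
    entry = board.setdefault(
        team_id, {"id": team_id, "score": None, "nb_submissions": 0}
    )
    entry["nb_submissions"] += 1

    # Keep only the best score seen so far for this team.
    best = entry["score"]
    if best is None:
        improved = True
    elif greater_is_better:
        improved = new_score > best
    else:
        improved = new_score < best
    if improved:
        entry["score"] = new_score

    # Recompute the live rank from the best scores.
    ordered = sorted(
        board.values(), key=lambda row: row["score"], reverse=greater_is_better
    )
    for position, row in enumerate(ordered, start=1):
        row["rank"] = position
    return ordered


# Start from the default benchmark row and record three submissions.
leaderboard = [{"id": "benchmark", "score": 0.6}]
for submitted_score in (0.55, 0.72, 0.65):
    leaderboard = update_leaderboard(leaderboard, "team_a", submitted_score)
```

After the three submissions, team_a still has a single row holding its best score, with nb_submissions counting every upload.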
For more details, please refer to the script leaderboard.
The users
Like the leaderboard, it is a CSV file. It is meant to be defined by the admin of the competition and contains 2 columns:
- login
- password
A default user is created initially so that you can start experimenting with the platform:
login | password |
---|---|
admin | password |
In order to add new participants, simply add rows to the current users.csv file.
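To make the users file concrete, here is a minimal sketch using only the standard library. It assumes a comma-separated file with a `login,password` header row, which the description above suggests but does not spell out. The helpers `load_users`, `add_user`, and `check_credentials` are hypothetical names and not taken from the users script.

```python
import csv
import io

# Assumed on-disk layout of users.csv, with the default user.
USERS_CSV = "login,password\nadmin,password\n"


def load_users(csv_text):
    """Parse the CSV text into a {login: password} dictionary."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["login"]: row["password"] for row in reader}


def add_user(csv_text, login, password):
    """Return the CSV text with one extra participant row appended."""
    buffer = io.StringIO()
    buffer.write(csv_text if csv_text.endswith("\n") else csv_text + "\n")
    csv.writer(buffer, lineterminator="\n").writerow([login, password])
    return buffer.getvalue()


def check_credentials(users, login, password):
    """Return True only when the login exists and the password matches."""
    return login in users and users[login] == password


updated_csv = add_user(USERS_CSV, "team_a", "s3cret")
users = load_users(updated_csv)
```

Passwords are compared in plain text here only because the file format described above stores them that way.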
For more details, please refer to the script users.
License
MIT License here.
If you like this project, let me know by buying me a coffee :)