# Presentation of the challenge
context_markdown = """
The goal of the first challenge is to estimate the category of an uploaded YouTube video.
"""
content_markdown = """
### Multi-class Problem
Each upload has the following features and target:
#### Features
- local_path: The local path to the upload's data.
- upload_id: The unique identifier of the upload.
- clean_upload_id: The upload_id with the "suicide_out_" prefix removed.
- upload_type: An enumeration representing the type of upload. Default is UploadType.GENERAL.
- features: A dictionary of additional features associated with the upload.
- title: The title of the upload.
- playlist_title: The title of the playlist the upload belongs to.
- description: The description of the upload.
- duration_string: The duration of the upload as a string.
- duration: The duration of the upload in seconds.
- upload_date: The date the upload was published.
- view_count: The number of views the upload has received.
- comment_count: The number of comments on the upload.
- like_count: The number of likes on the upload.
- tags: The tags associated with the upload.
#### Target
- categories: The categories associated with the upload.
You can find the details about the context/data/challenge [here](https://drive.google.com/file/d/1qyEmi6UUWlyzeVPhPnqY2JNRHBPutak-/view?usp=sharing).
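For intuition, the fields above can be pictured as a plain dataclass. This is only an illustrative sketch, not the real `Upload` class from `youtube_modules.py`:
```python
# Illustrative sketch of the upload structure described above;
# the real Upload class is defined in youtube_modules.py.
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class UploadSketch:
    local_path: str
    upload_id: str
    clean_upload_id: str             # upload_id without the "suicide_out_" prefix
    upload_type: str = "GENERAL"     # an enumeration (UploadType) in the real module
    features: Dict[str, object] = field(default_factory=dict)
    title: Optional[str] = None
    playlist_title: Optional[str] = None
    description: Optional[str] = None
    duration_string: Optional[str] = None
    duration: Optional[float] = None     # seconds
    upload_date: Optional[str] = None
    view_count: Optional[int] = None
    comment_count: Optional[int] = None
    like_count: Optional[int] = None
    tags: List[str] = field(default_factory=list)
    categories: List[str] = field(default_factory=list)  # target
```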
""" | |
#------------------------------------------------------------------------------------------------------------------# | |
# Guide for the participants to get X_train, y_train and X_test | |
# The google link can be placed in your google drive => get the shared links and place them here. | |
data_instruction_commands = """ | |
The data can be parsed using the [youtube_modules.py](https://drive.google.com/file/d/1FCKpBTvTdL2RoNpIp9fHY18006CiglT2/view?usp=drive_link) script. | |
You can find the readme [here](https://drive.google.com/file/d/1wBJmwfZ9JzcQ0MxvwamYxBjwYbpEsgMx/view?usp=drive_link) | |
```python
import pickle
from typing import List

from youtube_modules import *  # provides Upload, UploadType, ...

with open("<path/to/data>/train_uploads.pkl", "rb") as f:
    train_uploads: List[Upload] = pickle.load(f)
with open("<path/to/data>/test_uploads.pkl", "rb") as f:
    test_uploads: List[Upload] = pickle.load(f)
```
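As a starting point, the textual fields listed above can be combined into one document per upload. A minimal sketch (here `SimpleNamespace` stands in for the real `Upload` objects, and the attribute names are the ones listed in the challenge description):
```python
# Sketch: build one text feature per upload from its textual fields.
from types import SimpleNamespace

def upload_to_text(u) -> str:
    # Concatenate title, description, and tags into a single document.
    parts = [u.title or "", u.description or "", " ".join(u.tags or [])]
    return " ".join(p for p in parts if p)

# Stand-in for a real Upload object loaded from train_uploads.pkl.
uploads = [SimpleNamespace(title="Lo-fi mix", description="Beats to study to",
                           tags=["music", "lofi"])]
texts = [upload_to_text(u) for u in uploads]
# texts[0] == "Lo-fi mix Beats to study to music lofi"
```
The resulting strings can then be vectorized (e.g. with a TF-IDF vectorizer) to obtain X_train and X_test.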
Make sure to upload your predictions as a .csv file with two columns: "id" (range(len(test_uploads))) and "label" (one of 1, 2, 3).
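A minimal sketch of writing the submission file with the standard library (the label values here are illustrative):
```python
import csv

# Illustrative predictions: one label (1, 2, or 3) per test upload, in order.
predictions = [2, 1, 3]

with open("submission.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "label"])          # required column names
    for i, label in enumerate(predictions):
        writer.writerow([i, label])           # id is the row index
```
Upload the resulting `submission.csv`.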
## Quickstart: use the notebook remotely
1. Activate the environment: `conda activate py38_default`
2. Start the notebook server on the remote machine: `jupyter notebook --ip=0.0.0.0 --no-browser`
3. Copy the URL printed in the terminal into your browser, e.g.
   http://127.0.0.1:8888/?token=7de849a953befd20682d57ac33b3e6cd9024ca25eed2433
   then replace 127.0.0.1 with the remote machine's IP, e.g.
   http://1.222.333.4:8888/?token=7de849a953befd20682d57ac33b3e6cd9024ca25eed24336
""" | |
# Target on test (hidden from the participants) | |
Y_TEST_GOOGLE_PUBLIC_LINK = 'https://drive.google.com/file/d/1gQ3_ywJElpcBrewCFhVUM-fnV4SN62na/view?usp=sharing' | |
#------------------------------------------------------------------------------------------------------------------#
# Evaluation metric and content
from sklearn.metrics import f1_score
GREATER_IS_BETTER = True  # e.g. True for ROC-AUC, False for MSE
SKLEARN_SCORER = f1_score
SKLEARN_ADDITIONAL_PARAMETERS = {'average': 'weighted'}
evaluation_content = """
The predictions are evaluated with the weighted F1 score.
You can compute it with:
```python
from sklearn.metrics import f1_score
f1_score(y_train, y_pred_train, average='weighted')
```
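For intuition, the weighted F1 averages the per-class F1 scores, weighting each class by its support. A dependency-free sketch of that computation:
```python
# Sketch of the weighted F1: per-class F1 weighted by class support.
from collections import Counter

def weighted_f1(y_true, y_pred):
    support = Counter(y_true)
    total = 0.0
    for cls, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        total += n * f1
    return total / len(y_true)

weighted_f1([1, 1, 2, 3], [1, 2, 2, 3])  # 0.75
```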
More details [here](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html#sklearn.metrics.f1_score).
""" | |
#------------------------------------------------------------------------------------------------------------------#
# Leaderboard benchmark score, displayed to everyone
BENCHMARK_SCORE = 0.2
#------------------------------------------------------------------------------------------------------------------#