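from dataclasses import dataclass
from enum import Enum


@dataclass
class Task:
    benchmark: str  # task key in the results JSON file
    metric: str  # metric key in the results JSON file
    col_name: str  # column name displayed in the leaderboard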
# Select your tasks here
# ---------------------------------------------------
class Tasks(Enum):
    # task_key in the json file, metric_key in the json file, name to display in the leaderboard
    task0 = Task("anli_r1", "acc", "ANLI")
    task1 = Task("logiqa", "acc_norm", "LogiQA")


NUM_FEWSHOT = 0  # Set to the number of few-shot examples you use
# ---------------------------------------------------
class nc_tasks(Enum):
    task0 = Task("rel-amazon/user-churn", "auroc", "user-churn")
    task1 = Task("rel-amazon/item-churn", "auroc", "item-churn")
    task2 = Task("rel-avito/user-clicks", "auroc", "user-clicks")
    task3 = Task("rel-avito/user-visits", "auroc", "user-visits")
    task4 = Task("rel-hm/user-churn", "auroc", "hm-user-churn")
    task5 = Task("rel-stack/user-badge", "auroc", "user-badge")
    task6 = Task("rel-stack/user-engagement", "auroc", "user-engagement")
    task7 = Task("rel-f1/driver-dnf", "auroc", "driver-dnf")
    task8 = Task("rel-f1/driver-top3", "auroc", "driver-top3")
    task9 = Task("rel-trial/study-outcome", "auroc", "study-outcome")
    task10 = Task("rel-event/user-repeat", "auroc", "user-repeat")
    task11 = Task("rel-event/user-ignore", "auroc", "user-ignore")
# Your leaderboard name
TITLE = """<p align="center"><img src="https://relbench.stanford.edu/img/logo.png" alt="logo" width="400px" /></p>"""
# What does your leaderboard evaluate?
INTRODUCTION_TEXT = """
Relational Deep Learning is a new approach for end-to-end representation learning on data spread across multiple tables, such as in a relational database (see our vision paper). RelBench is the accompanying benchmark, which seeks to facilitate efficient, robust, and reproducible research in this direction. It comprises a collection of realistic, large-scale, and diverse datasets structured as relational tables, along with machine learning tasks defined on them. It provides full support for data downloading, task specification, and standardized evaluation in an ML-framework-agnostic manner. Additionally, it integrates with PyTorch Geometric to load the data as a graph and train GNN models, and with PyTorch Frame to encode the various types of table columns. Finally, there is a leaderboard for tracking progress.
"""
# Which evaluations are you running? How can people reproduce what you have?
LLM_BENCHMARKS_TEXT = f"""
## Overview of RelBench
"""
EVALUATION_QUEUE_TEXT = """
## Instructions for submitting your model
Once you have developed your model and obtained results, you can submit your test results to our leaderboards. For each dataset, we require the following information.
- **Your name**: Primary contact's name.
- **Your email**: Primary contact's email.
- **RelBench version**: The RelBench version used to conduct the experiments.
- **Model name**: The name of the method. This is a unique identifier of the model, so please choose a name distinct from all existing model names. An existing entry will be overwritten if the same model name is submitted again.
- **Task name**: The name of a RelBench dataset that you use to evaluate the method. Choose from the dropdown menus.
- **Is it an official submission**: Whether the implementation is official (implemented by the authors who proposed the method) or unofficial (a re-implementation of the method by non-authors).
- **Test performance**: Raw test performance output by the RelBench model evaluators, where the average and unbiased standard deviation must be taken over 5 different random seeds. You can either not fix random seeds at all, or use the random seeds from 0 to 4. We strongly discourage tuning the random seeds.
- **Validation performance**: Validation performance of the model that is used to report the test performance above.
- **Paper URL Link**: The original paper describing the method (an arXiv link is recommended; the paper need not be peer-reviewed). If your method has any original component (e.g., even just combining existing methods XXX and YYY), you have to write a technical report describing it (e.g., exactly how you combined XXX and YYY).
- **GitHub URL Link**: The GitHub repository or directory containing all code needed to reproduce the result. A placeholder repository is not allowed.
- **Number of Parameters**: The number of parameters of your model, which can be calculated by `sum(p.numel() for p in model.parameters())`. If you use multi-stage training (e.g., apply node2vec and then an MLP), please sum up all the parameters (both node2vec and MLP parameters).
- **Honor code**: Please acknowledge that your submission adheres to all the ethical policies and that your result is reproducible.
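
## Example: aggregating results over seeds
The test and validation performance fields above ask for an average and an unbiased standard deviation over 5 seeds. The snippet below is a minimal sketch of that aggregation using only the Python standard library. The score values and the `summarize` helper are hypothetical placeholders for illustration; they are not RelBench outputs or part of the RelBench API.

```python
import statistics

# Hypothetical per-seed test AUROC values for seeds 0 to 4.
# These are placeholder numbers, not real results.
test_scores = [0.70, 0.71, 0.69, 0.72, 0.68]


def summarize(scores):
    # Return (mean, standard deviation) over the per-seed scores.
    # statistics.stdev divides by n - 1, i.e. it is the sample estimator
    # usually meant by "unbiased standard deviation".
    if len(scores) != 5:
        raise ValueError("expected one score per seed, for 5 seeds")
    return statistics.mean(scores), statistics.stdev(scores)


mean, std = summarize(test_scores)
print(f"test: {mean:.4f} +/- {std:.4f}")
```

If you collect results with NumPy instead, note that `numpy.std` divides by n by default; passing `ddof=1` gives the same n - 1 estimator as above.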
"""
CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
CITATION_BUTTON_TEXT = r"""
@article{relbench,
  title={Relational Deep Learning: Graph Representation Learning on Relational Tables},
  author={Matthias Fey and Weihua Hu and Kexin Huang and Jan Eric Lenssen and Rishabh Ranjan and Joshua Robinson and Rex Ying and Jiaxuan You and Jure Leskovec},
  year={2023}
}
"""