Sonja Topf committed
Commit 25fddff · 1 Parent(s): 759c8fc
final commit
- MODEL_CARD.md +30 -0
- README.md +5 -5
- app.py +3 -3
- src/preprocess.py +1 -1
MODEL_CARD.md
ADDED
@@ -0,0 +1,30 @@
+# Model card - tox21_chemprop_classifier
+### Model details
+- Model name: Graph Isomorphism Network Tox21 Baseline
+- Developer: MIT & Stanford (trained by JKU Linz)
+- Paper URL: https://arxiv.org/abs/1810.00826
+- Model type / architecture:
+  - Graph Isomorphism Network implemented using PyTorch.
+  - Hyperparameters: [link to config](https://huggingface.co/spaces/ml-jku/tox21_gin_classifier/blob/main/config/config.json)
+  - A multitask network is trained for all Tox21 targets.
+- Inference: Access via FastAPI endpoint. Upon a Tox21 prediction request, the model
+  generates and returns predictions for all Tox21 targets simultaneously.
+- Model version: v0
+- Model date: 14.10.2025
+- Reproducibility: Code for full training is available and enables retraining of the model from
+  scratch.
+
+### Intended use
+This model serves as a baseline benchmark for evaluating and comparing toxicity prediction
+methods across the 12 pathway assays of the Tox21 dataset. It is not intended for clinical
+decision-making without experimental validation.
+
+### Metric
+Each Tox21 task is evaluated using the area under the receiver operating characteristic curve
+(AUC). Overall performance is reported as the mean AUC across all individual tasks.
+
+### Training data
+Tox21 training and validation sets.
+
+### Evaluation data
+Tox21 test set.
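Note on the Metric section of the model card: the per-task AUC and its mean over tasks could be computed roughly as sketched below. This is an illustration, not the leaderboard's evaluation code. The function names `roc_auc` and `mean_task_auc` are hypothetical, and the handling of unmeasured labels (represented here as `None` and skipped per task) is an assumption about how missing Tox21 labels are treated.

```python
from typing import Dict, Optional, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_task_auc(
    labels_by_task: Dict[str, Sequence[Optional[int]]],
    scores_by_task: Dict[str, Sequence[float]],
) -> float:
    """Mean of the per-task AUCs; compounds with a missing label are skipped (assumption)."""
    aucs = []
    for task, labels in labels_by_task.items():
        scores = scores_by_task[task]
        kept = [(y, s) for y, s in zip(labels, scores) if y is not None]
        aucs.append(roc_auc([y for y, _ in kept], [s for _, s in kept]))
    return sum(aucs) / len(aucs)
```

The pairwise loop is quadratic in the number of compounds, which is fine for a sanity check; a rank-based implementation would be the usual choice for larger evaluation sets.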
README.md
CHANGED
@@ -5,20 +5,20 @@ colorFrom: green
 colorTo: blue
 sdk: docker
 pinned: false
-license:
+license: cc-by-nc-4.0
 short_description: Graph Isomorphism Network Baseline Classifier for Tox21
 ---

 # Tox21 Graph Isomorphism Network Classifier

-This repository hosts a Hugging Face Space that provides an examplary API for submitting models to the [Tox21 Leaderboard](https://huggingface.co/spaces/
+This repository hosts a Hugging Face Space that provides an examplary API for submitting models to the [Tox21 Leaderboard](https://huggingface.co/spaces/ml-jku/tox21_leaderboard).

 In this example, we trained a GIN classifier on the Tox21 targets and saved the trained model in the `checkpoints/` folder.

-**Important:** For leaderboard submission, your Space needs to include training code. The file `train.py` should train the model using the config specified inside the `config/` folder and save the final model parameters into a file inside the `checkpoints/` folder. The model should be trained using the [Tox21_dataset](https://huggingface.co/datasets/
+**Important:** For leaderboard submission, your Space needs to include training code. The file `train.py` should train the model using the config specified inside the `config/` folder and save the final model parameters into a file inside the `checkpoints/` folder. The model should be trained using the [Tox21_dataset](https://huggingface.co/datasets/ml-jku/tox21) provided on Hugging Face. The datasets can be loaded like this:
 ```python
 from datasets import load_dataset
-ds = load_dataset("
+ds = load_dataset("ml-jku/tox21", token=token)
 train_df = ds["train"].to_pandas()
 val_df = ds["validation"].to_pandas()
 ```
@@ -60,7 +60,7 @@ That’s it, your model will be available as an API endpoint for the Tox21 Leade
 To run the GIN classifier, clone the repository and install dependencies:

 ```bash
-git clone https://huggingface.co/spaces/
+git clone https://huggingface.co/spaces/ml-jku/tox21_gin_classifier
 cd tox21_gin_classifier
 pip install -r requirements.txt
 ```
app.py
CHANGED
@@ -44,8 +44,8 @@ def root():
 @app.get("/metadata")
 def metadata():
     return {
-        "name": "
-        "version": "1.0
+        "name": "Tox21 GIN Classifier",
+        "version": "0.1.0",
         "max_batch_size": 256,
         "tox_endpoints": [
             "NR-AR",
@@ -74,5 +74,5 @@ def predict(request: Request):
     predictions = predict_func(request.smiles)
     return {
         "predictions": predictions,
-        "model_info": {"name": "
+        "model_info": {"name": "Tox21 GIN Classifier", "version": "0.1.0"},
     }
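Note on the `app.py` change: the metadata advertises `"max_batch_size": 256`, and `predict` reads `request.smiles`. A client could respect that limit by splitting its SMILES list into batches before sending them. The sketch below only builds the JSON request bodies. The payload shape `{"smiles": [...]}` is an assumption inferred from `request.smiles`, and the helper names `batched` and `build_request_bodies` are hypothetical; neither the route path nor the request schema is shown in this diff.

```python
import json
from typing import Iterator, List, Sequence

# Value taken from the /metadata response in app.py.
MAX_BATCH_SIZE = 256


def batched(smiles: Sequence[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` SMILES strings."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(smiles), size):
        yield list(smiles[start:start + size])


def build_request_bodies(smiles: Sequence[str], size: int = MAX_BATCH_SIZE) -> List[str]:
    """Serialize each batch as a JSON body of the assumed form {"smiles": [...]}."""
    return [json.dumps({"smiles": batch}) for batch in batched(smiles, size)]
```

Each returned string could then be posted to the Space's prediction endpoint with any HTTP client; the responses would need to be concatenated in batch order to line up with the input list.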
src/preprocess.py
CHANGED
@@ -9,7 +9,7 @@ from torch_geometric.utils import from_rdmol
 from datasets import load_dataset

 def get_tox21_split(token, cvfold=None):
-    ds = load_dataset("
+    ds = load_dataset("ml-jku/tox21", token=token)

     train_df = ds["train"].to_pandas()
     val_df = ds["validation"].to_pandas()