sctherapy-artifacts
Trained checkpoints, derived data, and evaluation results for the sctherapy_pytorch project, which predicts drug response (% inhibition) from gene expression and is benchmarked on a 12-patient AML cohort from the scTherapy paper.
Code: Tino3141/sctherapy_pytorch (FT-Transformer, GRANDE, LightGBM baseline), which holds the training/eval code, scripts, and configs that produced these artifacts. This repo is self-contained for artifacts: checkpoints, the full training set, both eval cohorts, and results are all here. (Training data is also mirrored at Tino3141/lincs-pharmaco-training / Tino3141/pharmaco_flat.)
The Python snippets below (`from eval.model_registry import ...`, `from src.model.lgbm import ...`) assume you're running inside a clone of the GitHub repo; this repo holds the artifacts, not the code.
Checkpoints
Neural (checkpoints/neural/): the exact checkpoints used in the AML eval
Pulled from Weights & Biases (cpinkl/pmlr-sctherapy). These are the precise epoch checkpoints behind the per-seed predictions in results/aml_evals/, not the final best-val (epoch-10) snapshots. The eval used an earlier epoch (6-9) per run; the filename encodes the epoch, and each .pt's stored epoch field matches it. The complete {42, 43, 44} seed sweep × 4 variants gives 12 files.
| File | Arch | Head | Seed | Epoch used | W&B run | val_rmse @ epoch |
|---|---|---|---|---|---|---|
| `ft_transformer_hill__s42__epoch7__run-mwwp1nej.pt` | FT-Transformer | Hill | 42 | 7 | mwwp1nej | 10.87 |
| `ft_transformer_hill__s43__epoch7__run-k36em957.pt` | FT-Transformer | Hill | 43 | 7 | k36em957 | 10.81 |
| `ft_transformer_hill__s44__epoch7__run-dnyzw2jk.pt` | FT-Transformer | Hill | 44 | 7 | dnyzw2jk | 10.72 |
| `ft_transformer_scalar__s42__epoch7__run-lv6eowft.pt` | FT-Transformer | Scalar | 42 | 7 | lv6eowft | 11.01 |
| `ft_transformer_scalar__s43__epoch9__run-1lx7r1vp.pt` | FT-Transformer | Scalar | 43 | 9 | 1lx7r1vp | 10.91 |
| `ft_transformer_scalar__s44__epoch8__run-5mtffjfa.pt` | FT-Transformer | Scalar | 44 | 8 | 5mtffjfa | 10.74 |
| `grande_hill__s42__epoch8__run-fhszu55u.pt` | GRANDE | Hill | 42 | 8 | fhszu55u | 11.47 |
| `grande_hill__s43__epoch8__run-0vsn6313.pt` | GRANDE | Hill | 43 | 8 | 0vsn6313 | 11.56 |
| `grande_hill__s44__epoch6__run-xutm14sz.pt` | GRANDE | Hill | 44 | 6 | xutm14sz | 11.72 |
| `grande_scalar__s42__epoch7__run-izgn7e8j.pt` | GRANDE | Scalar | 42 | 7 | izgn7e8j | 30.78 |
| `grande_scalar__s43__epoch6__run-iqngqwll.pt` | GRANDE | Scalar | 43 | 6 | iqngqwll | 31.00 |
| `grande_scalar__s44__epoch8__run-k2bepvf1.pt` | GRANDE | Scalar | 44 | 8 | k2bepvf1 | 30.45 |
Each .pt is a dict with model_state_dict, epoch, and val_rmse. Load it with the project's eval/model_registry.py (state-dict introspection rebuilds the architecture, so no separate config is needed), or via build_model after reading the shapes.
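Since the filename encodes architecture, head, seed, epoch, and W&B run, it can be parsed before loading. The sketch below is an illustration only: `parse_checkpoint_name` is a hypothetical helper (not part of the project code), and the mapping to registry keys assumes that `ft_transformer_*` files correspond to the `ft_*` keys.

```python
import re
from pathlib import Path

# Pattern inferred from the 12 filenames in the table above.
_CKPT_PATTERN = re.compile(
    r"^(?P<arch>ft_transformer|grande)_(?P<head>hill|scalar)"
    r"__s(?P<seed>\d+)__epoch(?P<epoch>\d+)__run-(?P<run_id>[0-9a-z]+)\.pt$"
)


def parse_checkpoint_name(path):
    """Split a checkpoint filename into its encoded fields (hypothetical helper)."""
    match = _CKPT_PATTERN.match(Path(path).name)
    if match is None:
        raise ValueError(f"unrecognised checkpoint name: {path}")
    arch, head = match["arch"], match["head"]
    # Assumption: registry keys abbreviate 'ft_transformer' to 'ft'.
    prefix = "ft" if arch == "ft_transformer" else "grande"
    return {
        "arch": arch,
        "head": head,
        "seed": int(match["seed"]),
        "epoch": int(match["epoch"]),
        "run_id": match["run_id"],
        "registry_key": f"{prefix}_{head}",
    }


info = parse_checkpoint_name(
    "checkpoints/neural/ft_transformer_hill__s43__epoch7__run-k36em957.pt"
)
print(info["registry_key"], info["seed"], info["epoch"], info["run_id"])
```

The parsed `epoch` can then be compared with the `epoch` field stored inside the loaded dict as a sanity check.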
Why these epochs (not best-val): the evaluated checkpoint was chosen on a combination of the mean and standard deviation of validation performance, favouring a stable epoch over the single lowest-mean snapshot. The selected epoch (6-9) therefore deliberately accepts a marginally higher mean val_rmse in exchange for lower variance, rather than chasing the minimum mean val_rmse alone.
Recovering other snapshots: the absolute best-val epoch (usually epoch 10) scores a marginally lower mean val_rmse (FT-hill ~10.5-10.6, GRANDE-hill ~11.3-11.6). To pull it instead, fetch `cpinkl/pmlr-sctherapy/model-<runid>:best` (the `best` alias, usually epoch 10). For any other epoch, fetch `model-<runid>:vN`, where v0 = epoch 1 … v9 = epoch 10.
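The alias scheme above can be turned into a small helper. This is a minimal sketch, assuming the artifacts are readable with your W&B credentials; `artifact_ref` and `download_snapshot` are hypothetical names, and the 1 to 10 epoch range follows the v0 to v9 mapping described above.

```python
def artifact_ref(run_id, epoch=None, project="cpinkl/pmlr-sctherapy"):
    """Build the W&B artifact reference for a run (hypothetical helper).

    epoch=None selects the 'best' alias; otherwise epoch N maps to v(N-1).
    """
    if epoch is None:
        alias = "best"
    else:
        if not 1 <= epoch <= 10:
            raise ValueError("epoch must be in 1..10 (v0 = epoch 1, v9 = epoch 10)")
        alias = f"v{epoch - 1}"
    return f"{project}/model-{run_id}:{alias}"


def download_snapshot(run_id, epoch=None, root="./wandb_ckpts"):
    """Download the snapshot and return the local directory path."""
    import wandb  # imported lazily so artifact_ref works without wandb installed

    artifact = wandb.Api().artifact(artifact_ref(run_id, epoch))
    return artifact.download(root=root)


print(artifact_ref("k36em957"))           # best-val snapshot
print(artifact_ref("k36em957", epoch=7))  # the epoch used in the AML eval
```

Calling `download_snapshot("k36em957")` requires network access and a logged-in W&B session.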
LightGBM (checkpoints/lgbm/)
| Path | Notes |
|---|---|
| `cell_seed42_ts0.8_nb1000_es50/model.txt` | Cell-split baseline (held-out cell lines), 1000 rounds, ES@50. Headline LGBM used in the registry. Val RMSE ≈ 6.96, AUC ≈ 0.70 on the 12-patient benchmark. |
| `lgbm_baseline_model.txt` | Alternative local LGBM. |
Usage
The neural .pt files are PyTorch state-dicts ({model_state_dict, epoch, val_rmse}): weights only, no architecture. The repo's eval/model_registry.py re-infers the architecture from the tensor shapes, so loading needs nothing but the checkpoint and a registry key (ft_scalar, ft_hill, grande_scalar, grande_hill, or lgbm; the filename tells you which).
```python
import numpy as np
import torch
from huggingface_hub import hf_hub_download
from eval.model_registry import REGISTRY, load_predictor

# Download one of the archived checkpoints (best FT-hill, seed 43, epoch 7)
ckpt = hf_hub_download(
    "Tino3141/sctherapy-artifacts",
    "checkpoints/neural/ft_transformer_hill__s43__epoch7__run-k36em957.pt",
)
pred = load_predictor(
    REGISTRY["ft_hill"],  # key must match the checkpoint's arch/head
    device=torch.device("cpu"),
    checkpoint_override=ckpt,
)

# Inputs: gene z-scores (N, 978), ECFP4 bits (N, 1024), dose in µM (N,)
gene = np.zeros((2, 978), dtype=np.float32)
ecfp4 = np.zeros((2, 1024), dtype=np.float32)
dose = np.array([1.0, 10.0], dtype=np.float32)
y = pred.predict(gene, ecfp4, dose)  # -> predicted % inhibition, shape (N,)
print(y)
```
LightGBM (model.txt) loads directly:
```python
from src.model.lgbm import LGBMDrugPredictor

lgbm = LGBMDrugPredictor.load("cell_seed42_ts0.8_nb1000_es50/model.txt")
# feature layout: np.hstack([gene(978), ecfp4(1024), dose(1)])
```
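To make the flat feature layout concrete, here is a minimal sketch that stacks the three blocks into one matrix. `build_flat_features` is a hypothetical helper, and whether the LightGBM wrapper expects this matrix directly or applies its own dose transform is an assumption to check against the project code.

```python
import numpy as np

N_GENES, N_BITS = 978, 1024  # block widths from the documented layout


def build_flat_features(gene, ecfp4, dose):
    """Stack [gene(978), ecfp4(1024), dose(1)] column-wise (hypothetical helper)."""
    gene = np.asarray(gene, dtype=np.float32)
    ecfp4 = np.asarray(ecfp4, dtype=np.float32)
    dose = np.asarray(dose, dtype=np.float32).reshape(-1, 1)  # (N,) -> (N, 1)
    if gene.shape[1] != N_GENES or ecfp4.shape[1] != N_BITS:
        raise ValueError("expected 978 gene columns and 1024 fingerprint bits")
    if not (len(gene) == len(ecfp4) == len(dose)):
        raise ValueError("gene, ecfp4 and dose must have the same number of rows")
    return np.hstack([gene, ecfp4, dose])


X = build_flat_features(np.zeros((2, 978)), np.zeros((2, 1024)), [1.0, 10.0])
print(X.shape)  # (2, 2003)
```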
The AML patient inputs are in eval/aml_eval/ (model_inputs/, alongside the ground-truth DSS to score against); the separate multi-cancer held-out set is in eval/zenodo_eval/. derived/gene_names.json gives the gene column order, and derived/lincs_gene_stats.npz holds the (mean, std) used to z-score new expression data. To download everything at once, run `hf download Tino3141/sctherapy-artifacts --local-dir ./sctherapy-artifacts`.
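Z-scoring new expression data with the stored per-gene statistics could look like the sketch below. The array names `mean` and `std` inside the .npz file are assumptions (inspect `np.load(...).files` to confirm), and the `eps` guard against zero variance is an addition for illustration, not something taken from the project code.

```python
import numpy as np


def zscore_expression(expr, mean, std, eps=1e-8):
    """Z-score an (N, G) expression matrix with per-gene (G,) statistics."""
    expr = np.asarray(expr, dtype=np.float32)
    mean = np.asarray(mean, dtype=np.float32)
    std = np.asarray(std, dtype=np.float32)
    # Clamp tiny standard deviations so constant genes do not divide by zero.
    return (expr - mean) / np.maximum(std, eps)


# With the real files (key names assumed):
#   stats = np.load("derived/lincs_gene_stats.npz")
#   z = zscore_expression(new_expr, stats["mean"], stats["std"])
# Columns of new_expr must follow the order in derived/gene_names.json.

toy = zscore_expression([[3.0, 2.0]], mean=[1.0, 2.0], std=[2.0, 1.0])
print(toy)  # [[1. 0.]]
```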
Derived data (derived/)
Small generated files needed to reproduce inference (not the raw sources):
| File | What |
|---|---|
| `gene_names.json` | 978 LINCS landmark gene symbols, in feature order |
| `feature_names.json` | Full flat-feature names `[genes |
| `lincs_gene_stats.npz` | Per-gene LINCS (mean, std) for z-scoring patient inputs |
| `matched_samples.parquet` | LINCS×PharmacoDB matched (cell, drug) rows |
| `training_data.parquet` | Full training set: 3.24M rows [sig_id, cell_iname, smiles, dose, inhibition, gene_expression(978), ecfp4(1024)] (1.5 GB) |
The full training parquet is included here so the repo is self-contained; it is
also mirrored at Tino3141/lincs-pharmaco-training / Tino3141/pharmaco_flat.
Eval datasets (eval/): two separate cohorts
eval/aml_eval/: the 12-patient AML benchmark
The primary patient benchmark (scTherapy paper; consolidated from the public Tino3141/aml12-drug-response-eval dataset). Patients are named patient1 … patient12.
| Path | What |
|---|---|
| `ground_truth/dss.csv` / `.parquet` | Ground-truth Drug Sensitivity Scores per (patient, drug); the eval target |
| `ground_truth/supplementary_scTherapy.xlsx` | scTherapy supplementary Excel the DSS were extracted from |
| `model_inputs/full_inputs.parquet` | Assembled (patient × drug) model inputs |
| `model_inputs/patient_deg_vectors.parquet` | 978-gene log2FC vector per patient |
| `model_inputs/drug_fingerprints.parquet` | 1024-bit ECFP4 per drug |
| `model_inputs/example_full_input.parquet` | Small worked example of the input layout |
| `degs/` | Differential-expression gene sets per patient (raw + filtered + meta) |
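For intuition, the assembled (patient × drug) inputs are a cross product of the per-patient gene vectors and the per-drug fingerprints. The sketch below shows one way to build such a product with NumPy. It is an illustration under assumptions: `cross_patient_drug` is a hypothetical helper, the single shared dose is a placeholder, and the actual full_inputs.parquet may be assembled differently.

```python
import numpy as np


def cross_patient_drug(patient_vecs, drug_fps, dose=1.0):
    """Pair every patient row with every drug row (hypothetical helper).

    Returns gene (P*D, G), ecfp4 (P*D, B), dose (P*D,), plus index arrays
    that map each output row back to its patient and drug.
    """
    patient_vecs = np.asarray(patient_vecs, dtype=np.float32)
    drug_fps = np.asarray(drug_fps, dtype=np.float32)
    n_pat, n_drug = len(patient_vecs), len(drug_fps)
    gene = np.repeat(patient_vecs, n_drug, axis=0)  # p0,p0,...,p1,p1,...
    ecfp4 = np.tile(drug_fps, (n_pat, 1))           # d0,d1,...,d0,d1,...
    doses = np.full(n_pat * n_drug, dose, dtype=np.float32)  # placeholder dose
    patient_idx = np.repeat(np.arange(n_pat), n_drug)
    drug_idx = np.tile(np.arange(n_drug), n_pat)
    return gene, ecfp4, doses, patient_idx, drug_idx


gene, ecfp4, doses, p_idx, d_idx = cross_patient_drug(
    np.zeros((2, 978)), np.zeros((3, 1024))
)
print(gene.shape, ecfp4.shape, doses.shape)  # (6, 978) (6, 1024) (6,)
```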
eval/zenodo_eval/: multi-cancer held-out set (NOT AML)
A separate single-cell drug-response set from public GEO/Zenodo studies, covering HNSCC (SCC47, JHU006, HN120/137), lung (PC9, H1975), CLL, HCC, breast (FCIBC02), etc., across 10 drugs. Each pseudo-bulk sample carries a binary sensitive/resistant label. Built from the ../tino/*.pt source (named after the author).
| Path | What |
|---|---|
| `expression_zscore.parquet` | Dose-expanded expression, LINCS z-score (built by scripts/build_zenodo_zscore.py) |
| `expression_logfc.parquet` | Dose-expanded expression, LINCS log2FC (built by scripts/build_zenodo_logfc.py) |
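Because each pseudo-bulk sample has a binary sensitive/resistant label, predictions on this cohort can be scored with ROC AUC. The sketch below computes AUC from pairwise comparisons. It assumes label 1 means sensitive and that a higher predicted inhibition should rank sensitive samples first; the project's own evaluation scripts may aggregate or orient scores differently.

```python
import numpy as np


def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. Quadratic in sample count, which is fine for
    small per-drug groups.
    """
    labels = np.asarray(labels).astype(bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("need at least one positive and one negative sample")
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())


# Toy example: 1 = sensitive, scores = predicted % inhibition (assumed direction)
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```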
eval/reference/: shared lookups
| Path | What |
|---|---|
| `lincs_landmark_genes.csv` | The 978 LINCS landmark genes |
| `drug_name_to_smiles.json` | Drug-name to canonical SMILES map |
Each folder has its own README.md. Raw AML patient .RDS files (~5 GB) are not included here; they can be re-downloaded from Zenodo 13340927.
Results (results/)
Per-drug / per-patient prediction CSVs, metric tables, and plots from the headline runs (exploratory/superseded variants were pruned):
| Dir | Cohort | What |
|---|---|---|
| `aml_evals/` | AML | Per-arch/head/seed AML predictions; the main neural-vs-LGBM comparison (filenames encode the epoch used) |
| `aml_lgbm_seed42/` | AML | LGBM baseline AML eval (summary, per-drug, per-patient AUC) |
| `zenodo_cluster/` | Zenodo | Run with all 5 models in one aggregated.csv, plus ROC plots |
| `zenodo_logfc_all_final_final/` | Zenodo | Final full Zenodo-cohort run |
| `zenodo_logfc_within_sweep/` | Zenodo | Within-file hyperparameter sweep grid |
| `split_comparison/` | - | Random vs cell vs drug vs cell_drug split analysis (the data-leakage finding) |
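To illustrate why the split type matters, the sketch below contrasts a group-wise split with a row-wise random one: in a cell split, every row of a held-out cell line lands in validation, so the model never sees that cell line during training. `group_split` is a hypothetical helper, and the 0.8 train fraction and seed 42 are assumptions loosely echoing the `ts0.8` and `seed42` tokens in the LightGBM directory name, not the project's actual splitting code.

```python
import numpy as np


def group_split(groups, train_frac=0.8, seed=42):
    """Split row indices so that no group appears on both sides."""
    groups = np.asarray(groups)
    unique = np.unique(groups)
    rng = np.random.default_rng(seed)
    rng.shuffle(unique)  # in-place shuffle of the group labels
    n_train = int(round(train_frac * len(unique)))
    train_mask = np.isin(groups, unique[:n_train])
    return np.flatnonzero(train_mask), np.flatnonzero(~train_mask)


# Toy example: one cell-line label per training row
cells = np.array(["A", "A", "B", "C", "C", "D", "E", "E", "E", "F"])
train_idx, val_idx = group_split(cells)
# Held-out cell lines never occur among the training rows.
print(sorted(set(cells[val_idx])), set(cells[train_idx]) & set(cells[val_idx]))
```

A drug split would pass drug identifiers as `groups` instead; a purely random split ignores groups and lets the same cell line or drug appear on both sides, which is the leakage risk the comparison examines.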