CrossAbSense — antibody developability oracles (v0.9)

Property-specific neural oracles that predict five biophysical developability assays for therapeutic IgGs from paired VH/VL sequences, combining frozen protein-language-model encoders (ESM-Cambrian, ProtT5) with configurable attention decoders.

Code: https://github.com/SimonCrouzet/CrossAbSense
Dataset: GDPa1 (242 IgGs, Ginkgo Bioworks)

Each property folder (<PROPERTY>_<config-checksum>/) contains: final.ckpt (model trained on all data — used by predict.py), fold0-4.ckpt (5-fold CV checkpoints), config.yaml, and property.txt.
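The folder layout above can be checked with a few lines of Python. This is a minimal sketch, not part of the CrossAbSense code base: the helper names are hypothetical, and it assumes that "fold0-4.ckpt" stands for five files named fold0.ckpt through fold4.ckpt. The folder name is split at the last underscore because some property names (for example PR_CHO) contain an underscore themselves.

```python
from pathlib import Path

FOLD_COUNT = 5  # fold0 .. fold4, per the layout described above


def split_property_folder(name: str) -> tuple[str, str]:
    """Split '<PROPERTY>_<config-checksum>' at the LAST underscore."""
    prop, sep, checksum = name.rpartition("_")
    if not sep or not prop or not checksum:
        raise ValueError(f"not a property folder name: {name!r}")
    return prop, checksum


def expected_files(include_folds: bool = False) -> list[str]:
    """File names a property folder should contain."""
    files = ["final.ckpt", "config.yaml", "property.txt"]
    if include_folds:
        # Assumption: one checkpoint per fold, named fold<i>.ckpt
        files += [f"fold{i}.ckpt" for i in range(FOLD_COUNT)]
    return files


def missing_files(folder: Path, include_folds: bool = False) -> list[str]:
    """Return the expected files that are absent from `folder`."""
    return [f for f in expected_files(include_folds) if not (folder / f).is_file()]
```

For example, `missing_files(Path("models/HIC_3595cc57"))` returns an empty list when the default download (final.ckpt plus metadata) is complete.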

Performance (5-fold cluster-stratified CV, Spearman ρ)

Property	This release (v0.9)	Paper (Table 1)
HIC (hydrophobicity)	0.685	0.644
Titer (expression)	0.425	0.428
PR_CHO (polyreactivity)	0.461	0.475
AC-SINS (self-association)	0.420	0.475
Tm2 (thermostability)	0.442	0.387
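For readers unfamiliar with the metric, Spearman ρ is the Pearson correlation of the rank vectors of measured and predicted values. The sketch below is a dependency-free illustration, not the evaluation code of the repository. Averaging ρ over folds is an assumption: the card does not say whether the reported numbers are per-fold means or computed on pooled out-of-fold predictions.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(y_true, y_pred):
    """Pearson correlation of the two rank vectors (undefined for constant input)."""
    rx, ry = average_ranks(y_true), average_ranks(y_pred)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def mean_fold_rho(folds):
    """Assumption: aggregate by averaging rho over (y_true, y_pred) fold pairs."""
    return mean(spearman_rho(t, p) for t, p in folds)
```

Because only ranks matter, a model can score well on this metric even when its predictions are on a different scale from the assay values.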

⚠️ Important caveat (v0.9)

These weights were trained from the published configs, but in an environment where BioPhi (OASis humanness) and ScaLoP were not available. Features from those two sources were replaced with sentinel values during training, so the feature inputs differ slightly from the paper runs. This mainly affects AC-SINS (0.055 below the paper). PR_CHO (0.014 below) and Titer (0.003 below) are close to Table 1, and HIC and Tm2 exceed it. A future v1.0 will retrain the feature-using properties with BioPhi/ScaLoP restored. Pin revision="v0.9" if you need exactly these weights.
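The per-property gap between this release and the paper can be read directly off the performance table. The short sketch below does that arithmetic; the numbers are copied from the table, and the function name is only illustrative.

```python
# Spearman rho values copied from the performance table: (v0.9, paper Table 1)
SCORES = {
    "HIC": (0.685, 0.644),
    "Titer": (0.425, 0.428),
    "PR_CHO": (0.461, 0.475),
    "AC-SINS": (0.420, 0.475),
    "Tm2": (0.442, 0.387),
}


def deltas(scores):
    """Release minus paper, rounded to the table's three decimals."""
    return {name: round(release - paper, 3) for name, (release, paper) in scores.items()}


if __name__ == "__main__":
    for name, delta in deltas(SCORES).items():
        print(f"{name:8s} {delta:+.3f}")
```

Negative values mean v0.9 scores below the paper for that property.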

Usage

pip install huggingface_hub
python scripts/download_models.py --revision v0.9        # final.ckpt only (add --folds for CV)
python src/predict.py --input inputs/my_seqs.csv --model models/HIC_3595cc57 --output preds.csv

By default only final.ckpt (+ small metadata) is downloaded; the 5 CV fold checkpoints are fetched only when you ask for them (--folds, or predict.py --use-cv/--fold).

Or let predict.py fetch on demand:

python src/predict.py --input inputs/my_seqs.csv --model HIC_3595cc57 --from-hf --output preds.csv
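If you prefer to fetch the weights from your own Python code, the default behaviour described above (final.ckpt and metadata only, folds on request) can be approximated with `huggingface_hub.snapshot_download` and its `allow_patterns` filter. This is a sketch under assumptions, not the logic of scripts/download_models.py: the repo id `SimonCrouzet/CrossAbSense`, the top-level placement of property folders, the fold file naming, and the helper names are all assumed here.

```python
from fnmatch import fnmatchcase


def allow_patterns(include_folds: bool = False) -> list[str]:
    """Glob patterns for the files to fetch from each property folder."""
    patterns = ["*/final.ckpt", "*/config.yaml", "*/property.txt"]
    if include_folds:
        # Assumption: CV checkpoints are named fold<i>.ckpt
        patterns.append("*/fold*.ckpt")
    return patterns


def would_download(repo_path: str, include_folds: bool = False) -> bool:
    """Local dry-run check: does a repo file path match the selected patterns?"""
    return any(fnmatchcase(repo_path, p) for p in allow_patterns(include_folds))


def download(repo_id: str = "SimonCrouzet/CrossAbSense",  # assumed repo id
             revision: str = "v0.9",
             include_folds: bool = False,
             local_dir: str = "models") -> str:
    """Download the selected files at a pinned revision; returns the local path."""
    # Imported here so the helpers above work without huggingface_hub installed
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        revision=revision,
        allow_patterns=allow_patterns(include_folds),
        local_dir=local_dir,
    )
```

Passing `revision="v0.9"` pins the download to this release, so a later v1.0 upload would not silently change the weights you get.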

License

Apache-2.0, matching the CrossAbSense repository.
