metadata
title: BioAssayAlign Compatibility Explorer
emoji: 🧪
colorFrom: green
colorTo: red
sdk: gradio
sdk_version: 6.9.0
python_version: '3.10'
app_file: app.py
pinned: false
license: mit
short_description: Rank a candidate molecule list against a bioassay.

BioAssayAlign Compatibility Explorer

BioAssayAlign is an assay-conditioned molecule ranking tool.

You provide:

  • a bioassay description and optional metadata
  • a list of candidate SMILES

The model returns:

  • a ranked list of molecules
  • a compatibility score for each one
  • explicit flags for invalid SMILES

What It Is

This is not a chatbot. It is not a potency predictor.

It is a ranking model trained on a frozen public bioassay dataset built from PubChem BioAssay and ChEMBL. It is designed to answer:

“Given this assay, which molecules should I look at first?”

What The Score Means

  • The app shows a priority band and a list-relative score first.
  • Those values explain the ranking better than the raw model score.
  • The raw score is not a probability. It is an uncalibrated ranking value from the scorer head.
  • The relative score is scaled within your submitted list, so the strongest molecule in that list will be near the top of the 0–100 scale.
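To make "list-relative" concrete, here is a minimal sketch of how raw ranking values could be mapped onto a 0–100 scale within one list. The min-max formula and the function name relative_scores are assumptions for illustration only; the app's actual scaling is not documented here and may differ.

```python
from typing import List


def relative_scores(raw_scores: List[float]) -> List[float]:
    """Min-max scale raw scores to 0-100 within one submitted list.

    Assumption: min-max scaling is used purely for illustration;
    the app's real list-relative formula may be different.
    """
    if not raw_scores:
        return []
    lo, hi = min(raw_scores), max(raw_scores)
    if hi == lo:
        # Every candidate ties, so the list carries no ordering information.
        return [100.0 for _ in raw_scores]
    return [100.0 * (s - lo) / (hi - lo) for s in raw_scores]


# Two lists with very different raw scores both top out at 100.
list_a = relative_scores([-3.2, -1.0, 0.4])
list_b = relative_scores([5.0, 7.5, 9.1])
```

Under this assumed scaling, the best molecule of each list reaches 100 even though the raw scores of the two lists barely overlap. A relative score therefore says where a molecule sits in its own list, not how it compares with molecules from another submission.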

How To Use It

  1. Enter the assay title and description in plain scientific language.
  2. Add metadata if you know it:
    • organism
    • readout
    • assay format
    • assay type
    • target UniProt ID
  3. Paste one SMILES per line or upload a CSV with a smiles or canonical_smiles column.
  4. Run ranking.
  5. Read the output in this order:
    • priority
    • relative score
    • chemistry context columns (MolWt, logP, TPSA)
    • raw model score only if needed

Recommended Input Style

The model is most reliable when assay information is provided as structured fields:

  • title
  • description
  • organism
  • readout
  • assay format
  • assay type
  • target UniProt IDs

You can paste SMILES directly or upload a CSV with a smiles or canonical_smiles column.
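Before uploading, it can help to check locally that a CSV exposes one of the accepted columns. The sketch below uses only the Python standard library. The helper name read_smiles, the case-insensitive header match, and the skipping of empty cells are assumptions, not a description of the app's own parser.

```python
import csv
import io
from typing import List

# Column names taken from the text above.
SMILES_COLUMNS = ("smiles", "canonical_smiles")


def read_smiles(csv_text: str) -> List[str]:
    """Return the non-empty SMILES strings from a CSV document.

    Assumptions: headers are matched case-insensitively and empty
    cells are skipped; the app itself may behave differently.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    lookup = {name.strip().lower(): name for name in (reader.fieldnames or [])}
    column = next((lookup[c] for c in SMILES_COLUMNS if c in lookup), None)
    if column is None:
        raise ValueError("expected a 'smiles' or 'canonical_smiles' column")
    smiles = []
    for row in reader:
        value = (row.get(column) or "").strip()
        if value:
            smiles.append(value)
    return smiles


example = read_smiles("id,smiles\n1,CCO\n2,\n3,c1ccccc1\n")
```

This only checks the file layout. It does not validate the chemistry of each SMILES string; the app flags invalid SMILES itself.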

Good Uses

  • ranking a screening shortlist for a new assay concept
  • triaging compounds before a more expensive downstream model or wet-lab step
  • testing how sensitive rankings are to assay wording and metadata

Example Assays Included In The UI

  • JAK2 cell assay
  • ALDH1A1 fluorescence assay
  • BTK binding quick check

These examples call the live model. They are not screenshots or mocked outputs.

Limits

  • This is a public-data model, not a medicinal chemistry oracle.
  • It does not predict IC50 directly.
  • It is strongest as a relative ranking tool over a candidate list you already care about.

Runtime Notes

  • The first request can be slower because the Space warms the model in the background.
  • Large candidate lists increase runtime. For interactive use, start with a few hundred molecules.
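One way to follow the "few hundred molecules" guidance is to split a long candidate list into smaller submissions. This is a minimal sketch; the batch size of 200 and the helper name batches are illustrative assumptions, not limits enforced by the Space.

```python
from typing import Iterator, List, Sequence


def batches(smiles: Sequence[str], size: int = 200) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` SMILES strings.

    Assumption: 200 is an arbitrary "few hundred" default.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(smiles), size):
        yield list(smiles[start:start + size])


# 450 placeholder strings stand in for a real candidate list.
candidates = ["C" * (i + 1) for i in range(450)]
chunk_sizes = [len(chunk) for chunk in batches(candidates)]
```

Because the relative score is computed per submitted list, relative scores from separate batches are not directly comparable. To compare molecules across batches, rerun the finalists from each batch together in a single list.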

Model

The Space reads the model repo from the MODEL_REPO_ID environment variable.

Default:

  • lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compatibility

If the champion (currently best-performing) model changes later, the Space can point to a new model repo by updating MODEL_REPO_ID, without changing the UI.
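The binding described above can be sketched as an environment lookup with a fallback. The function name resolve_model_repo_id and the choice to treat an empty value as unset are assumptions; the Space's app.py may resolve the variable differently.

```python
import os
from typing import Mapping, Optional

# Default repo ID as stated above.
DEFAULT_MODEL_REPO_ID = "lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compatibility"


def resolve_model_repo_id(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the model repo ID from MODEL_REPO_ID, or the default.

    Assumption: a blank value is treated the same as an unset variable.
    """
    env = os.environ if env is None else env
    value = env.get("MODEL_REPO_ID", "").strip()
    return value or DEFAULT_MODEL_REPO_ID


# "someone/other-model" is a hypothetical repo ID used only for illustration.
overridden = resolve_model_repo_id({"MODEL_REPO_ID": "someone/other-model"})
fallback = resolve_model_repo_id({})
```

Keeping the repo ID in configuration rather than in UI code is what lets the model be swapped without touching the interface.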