Boltz-2 ONNX (single-sequence v0)

ONNX-Runtime-compatible export of Boltz-2 (Wohlwend et al., 2024–2025; MIT-licensed), produced by boltz-dev. The model is split into two graphs so that the inference loop (recycling + diffusion sampling) can run client-side. It is designed for biocircus.io, which loads these graphs via ONNX Runtime Web + WebGPU and lets a user predict a structure entirely inside the browser tab.

This is v0: single-sequence protein only, no MSA, no templates, no affinity head. Confidence and full-atom output included. Three precision tiers.

What's here

Tier   Trunk    Diffusion step   Total     When to use
fp32   795 MB   1.05 GB          1.85 GB   Reference / debug
fp16   409 MB   538 MB           947 MB    Desktop / laptop default (recommended)
int8   217 MB   273 MB           490 MB    Smartphone / tablet / low-RAM

fp32/                          fp16/                               int8/
├── trunk.onnx                 ├── trunk_fp16.onnx                 ├── trunk_int8.onnx
├── trunk.onnx.data            ├── trunk_fp16.onnx.data            ├── trunk_int8.onnx.data
├── diffusion_step.onnx        ├── diffusion_step_fp16.onnx        ├── diffusion_step_int8.onnx
└── diffusion_step.onnx.data   └── diffusion_step_fp16.onnx.data   └── diffusion_step_int8.onnx.data

Each .onnx file is paired with an external-data sidecar (.onnx.data) holding the bulk of the weights. ONNX Runtime loads the pair automatically when both are co-located.
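
Because a graph without its sidecar will not load, it can help to check a downloaded tier before creating sessions. The following is a minimal sketch using only the file names from the listing above; the helper names (expected_files, missing_files) are illustrative and not part of this repo.

```python
from pathlib import Path

# File-name suffix per tier, taken from the directory listing above.
TIER_SUFFIX = {"fp32": "", "fp16": "_fp16", "int8": "_int8"}
GRAPHS = ("trunk", "diffusion_step")


def expected_files(tier: str) -> list:
    """Relative paths of the four files a tier needs (graph + sidecar, per graph)."""
    suffix = TIER_SUFFIX[tier]
    files = []
    for graph in GRAPHS:
        onnx_path = f"{tier}/{graph}{suffix}.onnx"
        files += [onnx_path, onnx_path + ".data"]
    return files


def missing_files(root: Path, tier: str) -> list:
    """Files absent under `root`; an empty list means the tier is complete."""
    return [rel for rel in expected_files(tier) if not (root / rel).is_file()]
```

Calling missing_files(Path("."), "fp16") from the repo root should return an empty list once all four fp16 files are present.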

Graph signatures

Trunk (one recycling pass)

Inputs (80 total): 78 preprocessed feature tensors from the Boltz-2 input pipeline (token_pad_mask, atom_pad_mask, ref_pos, ref_element, res_type, residue_index, token_bonds, type_bonds, mol_type, entity_id, sym_id, etc.), plus s_prev [B, N, 384] and z_prev [B, N, N, 128] (zeros on the first recycling iteration; trunk outputs fed back on subsequent ones).

Outputs (9 total):

  • s [B, N, 384]: single representation
  • z [B, N, N, 128]: pair representation
  • pdistogram [B, N, N, 1, 64]: distogram logits
  • q [B, A, 128], c [B, A, 128]: atom-level diffusion conditioning
  • atom_enc_bias [B, K, W, H, 12], atom_dec_bias [B, K, W, H, 12], token_trans_bias [B, N, N, 384]: diffusion biases
  • s_inputs [B, N, 384]: input embedder output, reused by the diffusion step
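
The feedback of s and z into s_prev and z_prev can be sketched as a small loop. This is an illustration only: run_trunk and zeros are hypothetical callables supplied by the caller (for example, a wrapper around an ORT session and an array constructor), and the number of passes is left as a parameter rather than taken from Boltz-2.

```python
from typing import Any, Callable, Dict, Tuple


def run_recycling(
    run_trunk: Callable[[Dict[str, Any]], Dict[str, Any]],  # hypothetical session wrapper
    feats: Dict[str, Any],                                   # the 78 feature tensors
    zeros: Callable[[Tuple[int, ...]], Any],                 # hypothetical array constructor
    batch: int,
    n_tokens: int,
    num_passes: int,
) -> Dict[str, Any]:
    """Run the trunk num_passes times, feeding s and z back as s_prev and z_prev."""
    # First pass: both recycled inputs are zeros of the documented shapes.
    s_prev = zeros((batch, n_tokens, 384))
    z_prev = zeros((batch, n_tokens, n_tokens, 128))
    outputs: Dict[str, Any] = {}
    for _ in range(num_passes):
        feed = dict(feats, s_prev=s_prev, z_prev=z_prev)
        outputs = run_trunk(feed)
        # Later passes: reuse the trunk's own single and pair representations.
        s_prev, z_prev = outputs["s"], outputs["z"]
    return outputs
```

The dictionary returned by the last pass holds the tensors that the diffusion step consumes.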

Diffusion step (one denoising iteration)

Inputs (87 total): the same 78 feature tensors, plus the 7 trunk-cached tensors (s, s_inputs, q, c, and the three biases), x_noisy [B, A, 3] (current atom coordinates), and sigma [B] (a 1-D tensor holding the t_hat = sigma_tm * (1 + gamma) value computed in the orchestrator).

Output: x_denoised [B, A, 3].
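
Assembling the feed for one denoising call might look like the sketch below. It assumes the graph's input names match the tensor names used in this section, which should be checked against the exported graph; diffusion_feed and t_hat are illustrative helpers, not functions from this repo.

```python
from typing import Any, Dict, Mapping

# Trunk outputs reused by the diffusion step, as listed above
# (s, s_inputs, q, c and the three biases). Names are assumed to match the graph.
TRUNK_CACHED = ("s", "s_inputs", "q", "c",
                "atom_enc_bias", "atom_dec_bias", "token_trans_bias")


def t_hat(sigma_tm: float, gamma: float) -> float:
    """Noise level passed as `sigma`: t_hat = sigma_tm * (1 + gamma)."""
    return sigma_tm * (1.0 + gamma)


def diffusion_feed(
    feats: Mapping[str, Any],      # the 78 feature tensors
    trunk_out: Mapping[str, Any],  # outputs of the last trunk pass
    x_noisy: Any,                  # [B, A, 3] current atom coordinates
    sigma: Any,                    # [B] tensor filled with t_hat
) -> Dict[str, Any]:
    """Merge features, cached trunk tensors, x_noisy and sigma into one feed dict."""
    feed = dict(feats)
    for name in TRUNK_CACHED:
        feed[name] = trunk_out[name]
    feed["x_noisy"] = x_noisy
    feed["sigma"] = sigma
    return feed
```

With 78 feature tensors the resulting dictionary has 87 entries, matching the input count stated above; z and pdistogram are deliberately left out.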

to_keys (used by the atom encoder) is not part of the graph contract: in PyTorch it is a functools.partial closure. Reconstruct it inside the diffusion graph by computing get_indexing_matrix(K = A_padded / W, W, H, device) with the known model hyperparameters W=32, H=128 (and equivalently for any other atom-window split).

Validation

Per the Phase 5 report, all three precisions sit inside the natural PyTorch inter-seed noise distribution for 1CRN single-sequence inference. Pairwise Kabsch-aligned Cα RMSD (Å):

  PyTorch ⟷ PyTorch:
  inter-seed pairs:  mean 6.77, range 5.69–8.06

  ORT ⟷ PyTorch:
  fp32  vs  PyTorch: mean 6.74, range 4.94–8.99
  fp16  vs  PyTorch: mean 6.73, range 4.95–8.97
  int8  vs  PyTorch: mean 6.20, range 4.97–8.01

  Cross-precision (ORT vs ORT, same seed):
  fp32  ↔  fp16:  0.186   ← essentially lossless
  fp32  ↔  int8:  2.31    ← within ORT cross-augmentation drift
  fp16  ↔  int8:  2.27

The high absolute RMSDs are intrinsic to single-sequence-no-MSA Boltz-2 (pLDDT ~0.45 on this prediction; the model is sampling broadly because it has no co-evolution signal). With MSA inputs the noise floor collapses ~10×; quantisation quality would still hold.

Use in biocircus.io

The biocircus runtime fetches a ModelManifest that points at this repo, picks a precision tier based on hardware probe, and runs the recycling + diffusion sampling loops in TypeScript against ORT Web sessions. See boltz_orchestrate.py for a Python reference implementation of the orchestration loop.
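
As an illustration of tier selection, the sketch below picks the largest tier whose total size (from the table under "What's here") fits a memory budget. The budget-based rule and the pick_tier name are assumptions for this example; the actual biocircus hardware probe may use different criteria.

```python
from typing import Optional

# Total size per tier in MB, from the table above (1.85 GB written as 1850 MB).
TIER_TOTAL_MB = {"fp32": 1850, "fp16": 947, "int8": 490}


def pick_tier(budget_mb: float, allow_fp32: bool = False) -> Optional[str]:
    """Return the largest tier that fits `budget_mb`, or None if none fits.

    fp32 is the reference / debug tier, so it is only considered on request.
    """
    order = ["fp32", "fp16", "int8"] if allow_fp32 else ["fp16", "int8"]
    for tier in order:
        if TIER_TOTAL_MB[tier] <= budget_mb:
            return tier
    return None
```

In this sketch a device with roughly 600 MB to spare ends up on int8, while a desktop with several gigabytes still defaults to fp16 unless fp32 is explicitly allowed.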

For the per-step diffusion math the orchestrator needs:

  • A Haar-uniform random-rotation generator (called compute_random_augmentation in Python).
  • A Kabsch-style weighted rigid align (used when alignment_reverse_diff=True, which it is by default in Boltz-2).
  • The Karras noise schedule (formula and constants in phase4/validation_report.md).
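
Two of these pieces are small enough to sketch in plain Python. The schedule below is the general form from Karras et al. (2022) with that paper's default constants as placeholders; these are not the Boltz-2 constants, which are in phase4/validation_report.md. The rotation sampler uses a normalised Gaussian quaternion, which is one standard way to draw Haar-uniform rotations and is not necessarily how compute_random_augmentation does it.

```python
import math
import random


def karras_sigmas(n_steps, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Noise levels decreasing from sigma_max to sigma_min.

    The default constants are placeholders, not the values used by Boltz-2.
    """
    if n_steps < 2:
        raise ValueError("n_steps must be at least 2")
    lo = sigma_min ** (1.0 / rho)
    hi = sigma_max ** (1.0 / rho)
    # Interpolate linearly in sigma^(1/rho) space, then raise back to the power rho.
    return [(hi + i / (n_steps - 1) * (lo - hi)) ** rho for i in range(n_steps)]


def random_rotation(rng):
    """3x3 rotation matrix (nested lists) drawn Haar-uniformly from SO(3)."""
    # A 4-D standard Gaussian vector, normalised, is uniform on the unit 3-sphere;
    # read as a unit quaternion it gives a uniformly distributed rotation.
    w, x, y, z = (rng.gauss(0.0, 1.0) for _ in range(4))
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / norm, x / norm, y / norm, z / norm
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]
```

For example, karras_sigmas(200) gives a 200-entry list for the sampling loop, and random_rotation(random.Random(0)) gives a reproducible rotation for testing.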

License & citation

MIT, matching the upstream Boltz repo. Please cite Boltz-2:

@article{wohlwend2025boltz2,
  title  = {Boltz-2: Towards Accurate and Efficient Binding Affinity Prediction},
  author = {Wohlwend, J. and others},
  year   = {2025},
  doi    = {10.1101/2025.06.14.659707}
}

Provenance

Exported with torch.onnx.export(..., dynamo=True) on PyTorch 2.12, ONNX opset 18. Quantised with onnxconverter_common.float16 (fp16) and onnxruntime.quantization.quantize_dynamic (int8). See the boltz-dev repo's scripts/ directory for the export and quantisation scripts, and the EXPORT_PLAN.md pitfall catalogue for documented onnxconverter_common workarounds on dynamo-exported graphs.
