Boltz-2 ONNX (single-sequence v0)
ONNX-Runtime-compatible export of Boltz-2 (Wohlwend et al., 2024–2025; MIT-licensed), produced by boltz-dev. The model is split into two graphs so the inference loop (recycling + diffusion sampling) can run client-side. It is designed for biocircus.io, which loads these graphs via ONNX Runtime Web + WebGPU and lets a user predict a structure entirely inside the browser tab.
This is v0: single-sequence protein only, no MSA, no templates, no affinity head. Confidence and full-atom output included. Three precision tiers.
What's here
| Tier | Trunk | Diffusion step | Total | When to use |
|---|---|---|---|---|
| fp32 | 795 MB | 1.05 GB | 1.85 GB | Reference / debug |
| fp16 | 409 MB | 538 MB | 947 MB | Desktop / laptop default (recommended) |
| int8 | 217 MB | 273 MB | 490 MB | Smartphone / tablet / low-RAM |
fp32/                          fp16/                               int8/
├── trunk.onnx                 ├── trunk_fp16.onnx                 ├── trunk_int8.onnx
├── trunk.onnx.data            ├── trunk_fp16.onnx.data            ├── trunk_int8.onnx.data
├── diffusion_step.onnx        ├── diffusion_step_fp16.onnx        ├── diffusion_step_int8.onnx
└── diffusion_step.onnx.data   └── diffusion_step_fp16.onnx.data   └── diffusion_step_int8.onnx.data
Each .onnx file is paired with an external-data sidecar (.onnx.data) holding the bulk of the weights. ONNX Runtime loads the pair automatically when both are co-located.
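A minimal sketch of resolving one tier's files and opening sessions from Python with onnxruntime. The helper names `tier_files` and `open_sessions`, and the idea of a local `root` directory holding the three tier folders, are assumptions made for illustration; only the file names come from the layout above.

```python
from pathlib import Path

# File-name suffix per tier, following the directory layout above.
TIER_SUFFIX = {"fp32": "", "fp16": "_fp16", "int8": "_int8"}


def tier_files(root, tier):
    """Return {graph_name: Path} for one tier, checking that each sidecar is present."""
    suffix = TIER_SUFFIX[tier]
    files = {}
    for graph in ("trunk", "diffusion_step"):
        model = Path(root) / tier / f"{graph}{suffix}.onnx"
        sidecar = model.with_name(model.name + ".data")
        if not (model.is_file() and sidecar.is_file()):
            raise FileNotFoundError(
                f"need both {model.name} and {sidecar.name} in {model.parent}"
            )
        files[graph] = model
    return files


def open_sessions(root, tier="fp16", providers=("CPUExecutionProvider",)):
    """Open one InferenceSession per graph (hypothetical helper)."""
    import onnxruntime as ort  # imported lazily so tier_files works without it

    return {
        name: ort.InferenceSession(str(path), providers=list(providers))
        for name, path in tier_files(root, tier).items()
    }
```

Checking for the sidecar up front gives a clearer error than the one ONNX Runtime raises when it cannot find the external data file.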
Graph signatures
Trunk (one recycling pass)
Inputs (80 total): 78 preprocessed feature tensors from the Boltz-2 input pipeline (token_pad_mask, atom_pad_mask, ref_pos, ref_element, res_type, residue_index, token_bonds, type_bonds, mol_type, entity_id, sym_id, etc.), plus s_prev [B, N, 384] and z_prev [B, N, N, 128] (zeros on the first recycling iteration; trunk outputs fed back on subsequent ones).
Outputs (9 total):
- s [B, N, 384]: single representation
- z [B, N, N, 128]: pair representation
- pdistogram [B, N, N, 1, 64]: distogram logits
- q [B, A, 128], c [B, A, 128]: atom-level diffusion conditioning
- atom_enc_bias [B, K, W, H, 12], atom_dec_bias [B, K, W, H, 12], token_trans_bias [B, N, N, 384]: diffusion biases
- s_inputs [B, N, 384]: input embedder output, reused by the diffusion step
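The recycling contract described above (zeros for s_prev and z_prev on the first pass, then the trunk's own s and z fed back) can be sketched as follows. `trunk_run` is a hypothetical callable mapping a dict of input arrays to a dict of output arrays, and the sketch assumes the graph's tensor names match the names used in this section.

```python
import numpy as np


def run_recycling(trunk_run, feats, n_tokens, n_passes, batch=1, dtype=np.float32):
    """Call the trunk n_passes times, feeding s and z back as s_prev and z_prev.

    trunk_run: hypothetical callable, dict of input arrays -> dict of output arrays.
    feats:     the 78 preprocessed feature tensors, keyed by input name.
    """
    assert n_passes >= 1
    # First pass: both recycled tensors start as zeros.
    s_prev = np.zeros((batch, n_tokens, 384), dtype=dtype)
    z_prev = np.zeros((batch, n_tokens, n_tokens, 128), dtype=dtype)
    out = None
    for _ in range(n_passes):
        out = trunk_run({**feats, "s_prev": s_prev, "z_prev": z_prev})
        # Later passes: recycle the trunk's own outputs.
        s_prev, z_prev = out["s"], out["z"]
    return out  # outputs of the final pass, to be cached for the diffusion step
```

With an onnxruntime session `sess`, one way to build `trunk_run` is `lambda feed: dict(zip([o.name for o in sess.get_outputs()], sess.run(None, feed)))`.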
Diffusion step (one denoising iteration)
Inputs (87 total): the same 78 feature tensors, plus the 7 trunk-cached tensors (s, s_inputs, q, c, and the three biases), plus x_noisy [B, A, 3] (current atom coords) and sigma [B] (1-D tensor; the t_hat = sigma_tm * (1 + gamma) value computed in the orchestrator).
Output: x_denoised [B, A, 3].
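A hedged sketch of how an orchestrator might drive one denoising call. It follows the generic EDM/AlphaFold 3 style stochastic update and is not taken from boltz_orchestrate.py; it omits the random augmentation and the reverse-diffusion alignment. `step_run` is a hypothetical callable (dict of inputs to dict of outputs), and `noise_scale` and `step_scale` are illustrative parameters whose real values are not given here.

```python
import math

import numpy as np


def denoise_once(step_run, feats, cached, x, sigma_tm, sigma_t, gamma, rng,
                 noise_scale=1.0, step_scale=1.0):
    """One sampler step: inflate the noise level to t_hat, denoise, step to sigma_t."""
    t_hat = sigma_tm * (1.0 + gamma)
    # Extra noise needed to move from noise level sigma_tm up to t_hat.
    extra = noise_scale * math.sqrt(max(t_hat**2 - sigma_tm**2, 0.0))
    x_noisy = (x + extra * rng.standard_normal(x.shape)).astype(np.float32)

    feed = {
        **feats,   # the 78 feature tensors
        **cached,  # trunk-cached tensors (s, s_inputs, q, c, biases)
        "x_noisy": x_noisy,
        "sigma": np.full((x.shape[0],), t_hat, dtype=np.float32),  # 1-D, shape [B]
    }
    x_denoised = step_run(feed)["x_denoised"]

    # Euler step along the denoising direction, from t_hat towards sigma_t.
    direction = (x_noisy - x_denoised) / t_hat
    return x_noisy + step_scale * (sigma_t - t_hat) * direction
```

Passing sigma as a length-B array rather than a scalar matches the 1-D shape stated in the signature.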
to_keys (used by the atom encoder) is not part of the graph contract: in PyTorch it is a functools.partial closure. Reconstruct it inside the diffusion graph by computing get_indexing_matrix(K = A_padded / W, W, H, device) with the known model hyperparameters W=32, H=128 (and equivalently for any other atom-window split).
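To make the windowing concrete, here is an illustration of the gather that to_keys is assumed to perform: each block of W query atoms sees H key atoms centred on that block, zero-padded at the ends (AlphaFold 3 style sequence-local atom attention). This is a plain-indexing sketch of the assumed semantics, not a port of get_indexing_matrix, and it has not been checked against the exported graph; `gather_key_windows` is a hypothetical name.

```python
import numpy as np


def gather_key_windows(x, W=32, H=128):
    """x: [B, A, D] with A divisible by W. Returns key windows of shape [B, K, H, D]."""
    B, A, D = x.shape
    assert A % W == 0, "pad the atom dimension to a multiple of W first"
    K = A // W
    # Window k covers H atoms centred on the midpoint of query block k.
    starts = np.arange(K) * W + W // 2 - H // 2      # [K]
    idx = starts[:, None] + np.arange(H)[None, :]    # [K, H]
    valid = (idx >= 0) & (idx < A)
    gathered = x[:, np.clip(idx, 0, A - 1), :]       # [B, K, H, D]
    # Zero out positions that fall outside the atom range.
    return gathered * valid[None, :, :, None]
```

For W=32 and H=128, the window for block k spans atoms 32k - 48 up to (but not including) 32k + 80.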
Validation
Per the Phase 5 report, all three precisions sit inside the natural PyTorch inter-seed noise distribution for 1CRN single-sequence inference. Pairwise Kabsch-aligned Cα RMSD (Å):
| Comparison | Kind | Mean | Range |
|---|---|---|---|
| inter-seed pairs | PyTorch ↔ PyTorch | 6.77 | 5.69–8.06 |
| fp32 vs PyTorch | ORT ↔ PyTorch | 6.74 | 4.94–8.99 |
| fp16 vs PyTorch | ORT ↔ PyTorch | 6.73 | 4.95–8.97 |
| int8 vs PyTorch | ORT ↔ PyTorch | 6.20 | 4.97–8.01 |
Cross-precision (ORT vs ORT, same seed):
fp32 ↔ fp16: 0.186 (essentially lossless)
fp32 ↔ int8: 2.31 (within ORT cross-augmentation drift)
fp16 ↔ int8: 2.27
The high absolute RMSDs are intrinsic to single-sequence, no-MSA Boltz-2 (pLDDT ~0.45 on this prediction; the model samples broadly because it has no co-evolution signal). With MSA inputs the noise floor collapses ~10×; quantisation quality is expected to hold there as well.
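For readers who want to reproduce this kind of comparison, a standard Kabsch-aligned RMSD can be computed as below. This is the textbook algorithm in NumPy, not the script used for the Phase 5 report; inputs are assumed to be matched [N, 3] arrays of Cα coordinates.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two [N, 3] point sets after optimally superposing P onto Q."""
    P = np.asarray(P, dtype=np.float64)
    Q = np.asarray(Q, dtype=np.float64)
    # Remove translation by centring both sets.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Optimal rotation from the SVD of the 3x3 covariance matrix.
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    aligned = Pc @ R.T
    return float(np.sqrt(((aligned - Qc) ** 2).sum(axis=1).mean()))
```

The reflection guard matters for protein structures: without it, a mirror image would score as a perfect match.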
Use in biocircus.io
The biocircus runtime fetches a ModelManifest that points at this repo, picks a precision tier based on hardware probe, and runs the recycling + diffusion sampling loops in TypeScript against ORT Web sessions. See boltz_orchestrate.py for a Python reference implementation of the orchestration loop.
For the per-step diffusion math the orchestrator needs:
- A Haar-uniform random-rotation generator (called compute_random_augmentation in Python).
- A Kabsch-style weighted rigid align (used when alignment_reverse_diff=True, which it is by default in Boltz-2).
- The Karras noise schedule (formula and constants in phase4/validation_report.md).
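Two of the pieces listed above can be sketched generically in NumPy. `random_rotation` draws a Haar-uniform rotation via QR decomposition, and `karras_sigmas` implements the schedule in the general form given by Karras et al. (2022). Neither is a port of the Boltz-2 code: the function names are hypothetical, and the schedule constants Boltz-2 actually uses (and any extra scaling it applies) live in the validation report and are not reproduced here.

```python
import numpy as np


def random_rotation(rng):
    """Draw a 3x3 rotation matrix uniformly (Haar measure) on SO(3)."""
    A = rng.standard_normal((3, 3))
    Q, R = np.linalg.qr(A)
    # Fix the column signs so Q is Haar-distributed on O(3).
    Q = Q * np.sign(np.diag(R))
    # Flip one column if needed so the result is a proper rotation.
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]
    return Q


def karras_sigmas(n_steps, sigma_min, sigma_max, rho):
    """Generic Karras schedule: n_steps noise levels from sigma_max down to sigma_min."""
    t = np.linspace(0.0, 1.0, n_steps)
    hi = sigma_max ** (1.0 / rho)
    lo = sigma_min ** (1.0 / rho)
    return (hi + t * (lo - hi)) ** rho
```

The sign correction after QR is what makes the draw uniform; using the raw Q factor biases the distribution.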
License & citation
MIT, matching the upstream Boltz repo. Please cite Boltz-2:
@article{wohlwend2025boltz2,
title = {Boltz-2: Towards Accurate and Efficient Binding Affinity Prediction},
author = {Wohlwend, J. and others},
year = {2025},
doi = {10.1101/2025.06.14.659707}
}
Provenance
Exported with torch.onnx.export(..., dynamo=True) on PyTorch 2.12, ONNX opset 18. Quantised with onnxconverter_common.float16 (fp16) and onnxruntime.quantization.quantize_dynamic (int8). See the boltz-dev repo's scripts/ directory for the export and quantisation scripts, and the EXPORT_PLAN.md pitfall catalogue for documented onnxconverter_common workarounds on dynamo-exported graphs.