UltraNMR: A Large-scale Foundation Model for NMR-based Molecular Structure Analysis

Official model checkpoints for UltraNMR, a 120-million-parameter foundation model designed for Nuclear Magnetic Resonance (NMR) spectroscopy. UltraNMR is pre-trained on 158 million simulated paired proton (1H) and carbon (13C) spectra to learn generalizable, chemically meaningful spectral representations, with the goal of bridging the simulation-to-real gap.

For technical details, please refer to our full paper: "A large-scale foundation model enables simulation-to-real adaptation for nuclear magnetic resonance-based molecular structure analysis".


📂 Repository Structure & Checkpoints

This repository contains the following weight directories, covering the pre-trained model and the fine-tuned downstream tasks:

1. Pre-trained Foundation Model

  • checkpoints_nce/
    • Description: The core pre-trained UltraNMR foundation model, completed after the second-stage isomer contrastive learning.
    • Usage: This model generates generalizable global spectral embeddings and serves as the baseline/backbone for all downstream molecular analysis tasks.
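
To work with a single weight directory such as checkpoints_nce/, it can be convenient to fetch only that folder rather than the whole repository. The following is a minimal sketch, assuming the `huggingface_hub` package is installed. `REPO_ID` is a hypothetical placeholder (this card does not state the repository id), and the helper names are illustrative, not part of any UltraNMR API.

```python
import os

# Hypothetical placeholder: replace with the actual Hugging Face repo id of this model.
REPO_ID = "<org>/<repo>"


def allow_patterns_for(directory: str) -> list:
    """Return glob patterns selecting every file under one weight directory."""
    return [f"{directory.strip('/')}/*"]


def download_checkpoint(directory: str = "checkpoints_nce", repo_id: str = REPO_ID) -> str:
    """Download one weight directory and return its local path."""
    # Imported lazily so allow_patterns_for() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    snapshot_dir = snapshot_download(
        repo_id=repo_id,
        allow_patterns=allow_patterns_for(directory),
    )
    return os.path.join(snapshot_dir, directory.strip("/"))
```

Calling `download_checkpoint("checkpoints_nce")` with a real repo id would return the local folder holding the pre-trained weights. How those weights are loaded depends on the UltraNMR code base, which this sketch does not cover.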

2. De Novo Molecular Structure Elucidation

  • checkpoints_nmr2smiles_formula/
    • Description: Checkpoints from the large-scale sequence-to-sequence training stage on simulated spectra. In this stage the model learns to map NMR spectra directly to molecular SMILES strings under molecular formula constraints.
  • checkpoints_nmrgym_formula/
    • Description: Fine-tuned checkpoints for de novo molecular structure elucidation, evaluated on the NMRGym benchmark (conditioned on the molecular formula).
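
A molecular formula constraint means that a candidate structure must contain exactly the given number of atoms of each element. As an illustration only, the sketch below parses a flat formula string into element counts and compares two formulas. It is not the model's input encoding, the function names are hypothetical, and it assumes formulas without parentheses, charges, or isotope labels.

```python
import re
from collections import Counter

_FLAT_FORMULA = re.compile(r"(?:[A-Z][a-z]?\d*)+")
_ELEMENT_COUNT = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula: str) -> Counter:
    """Parse a flat formula such as 'C9H8O4' into per-element atom counts."""
    if not _FLAT_FORMULA.fullmatch(formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    counts = Counter()
    for element, number in _ELEMENT_COUNT.findall(formula):
        # A missing number means a single atom, e.g. the 'O' in 'C2H6O'.
        counts[element] += int(number) if number else 1
    return counts


def same_formula(a: str, b: str) -> bool:
    """True if both formula strings describe the same element counts."""
    return parse_formula(a) == parse_formula(b)
```

A check like `same_formula` could be used to filter generated candidates once their formula has been computed with a cheminformatics toolkit. Deriving a formula from a SMILES string is outside the scope of this sketch.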

3. Downstream Fine-tuned Property Prediction

  • checkpoints_nmrgym_fg/
    • Description: Fine-tuned checkpoints for functional group identification on the NMRGym benchmark. The model predicts the presence or absence of each of 20 distinct functional groups.
  • checkpoints_nmrgym_cls/
    • Description: Fine-tuned checkpoints for natural product superclass classification on the NMRGym benchmark. The model assigns each NMR spectrum to one of 74 distinct natural product superclasses.
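
The two property prediction tasks above differ in how their outputs would typically be decoded: presence or absence of 20 groups is a multi-label problem, while choosing among 74 superclasses is a multi-class problem. The sketch below shows the usual decoding for each. It assumes that each head returns raw logits, that a 0.5 probability threshold is used, and that every spectrum has exactly one superclass. None of these details are confirmed by this card, and the function names are hypothetical.

```python
import math

NUM_FUNCTIONAL_GROUPS = 20  # stated in this model card
NUM_SUPERCLASSES = 74       # stated in this model card


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def decode_functional_groups(logits, threshold: float = 0.5):
    """Multi-label decoding: indices of groups whose probability meets the threshold."""
    if len(logits) != NUM_FUNCTIONAL_GROUPS:
        raise ValueError("expected one logit per functional group")
    return [i for i, logit in enumerate(logits) if sigmoid(logit) >= threshold]


def decode_superclass(logits) -> int:
    """Multi-class decoding: index of the highest-scoring superclass."""
    if len(logits) != NUM_SUPERCLASSES:
        raise ValueError("expected one logit per superclass")
    return max(range(len(logits)), key=lambda i: logits[i])
```

Mapping the returned indices to functional group or superclass names requires the label vocabulary used during fine-tuning, which is not listed here.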