GLACIER: Graph-Language Alignment for Chemical Inference and Exploration using Representations

GLACIER is a multimodal student-teacher foundation model designed for molecular property prediction. It integrates molecular graphs, SMILES strings, and physicochemical descriptors to learn rich molecular embeddings.

Sample Usage

Since this model uses a custom architecture, download the repository files and add them to the Python path before loading the model.

import sys

import torch
from huggingface_hub import snapshot_download

# Download the repository to access custom model code
repo_dir = snapshot_download(repo_id="glacier-hf/GLACIER-100k-MiniMol")
sys.path.append(repo_dir)

from data.dataloader import SmilesMoleculeDataset, build_dataloader
from glacier_student import Glacier

# Load the pretrained GLACIER model
model = Glacier.from_pretrained("glacier-hf/GLACIER-100k-MiniMol")

# Prepare input data (the example SMILES string is caffeine)
dataset = SmilesMoleculeDataset(smiles=["Cn1c(=O)c2c(ncn2C)n(C)c1=O"])
dataloader = build_dataloader(dataset, batch_size=1)

# Switch to evaluation mode and compute the embedding without gradient tracking
model.eval()
batch = next(iter(dataloader))
with torch.no_grad():
    embedding = model(batch)
print(embedding)
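
Once embeddings are available, a common next step is to compare molecules by cosine similarity. The sketch below is a minimal, dependency-free illustration. It assumes each embedding has already been converted to a flat list of floats (for a PyTorch tensor, something like embedding.squeeze(0).tolist(), assuming the model returns one vector per molecule, which this card does not state). The cosine_similarity helper and the example vectors are hypothetical and not part of the GLACIER repository.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder vectors standing in for two molecule embeddings
# (illustrative values only, not real GLACIER output)
emb_a = [1.0, 0.0, 2.0]
emb_b = [2.0, 0.0, 4.0]
print(cosine_similarity(emb_a, emb_b))
```

Values close to 1 indicate vectors pointing in nearly the same direction. Whether that corresponds to chemical similarity depends on how the embeddings were trained.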

GLACIER Model Files

  • dataloader: customized dataloader for multimodal learning
  • encoders: graph, text, and tabular encoders
  • fusion: Finsler geometry-aware fusion method
  • glacier_student: GLACIER model backbone and contrastive loss
  • utils: miscellaneous helper functions
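
Because the sample usage imports these modules directly from the downloaded snapshot, it can help to confirm they are present before importing. The following is a minimal sketch using only the standard library. The module names come from the list above, but whether each one is a single .py file or a package directory is an assumption, so the check accepts either. The find_missing_modules helper is hypothetical and not part of the repository.

```python
from pathlib import Path
from typing import Iterable, List

# Module names taken from the file list above; their exact layout in the
# repository (single .py file vs. package directory) is an assumption.
EXPECTED_MODULES = ["dataloader", "encoders", "fusion", "glacier_student", "utils"]


def find_missing_modules(repo_dir: str, names: Iterable[str] = EXPECTED_MODULES) -> List[str]:
    """Return the names with no matching .py file or directory under repo_dir."""
    root = Path(repo_dir)
    missing = []
    for name in names:
        # Search recursively, since modules may sit in a subdirectory
        has_file = any(root.rglob(f"{name}.py"))
        has_dir = any(p.is_dir() for p in root.rglob(name))
        if not (has_file or has_dir):
            missing.append(name)
    return missing
```

Calling find_missing_modules(repo_dir) with the path returned by snapshot_download yields an empty list when every listed module is found.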

Citation

@inproceedings{nguyen2026glacier,
  title={GLACIER: A Multimodal Student-Teacher Foundation Model for Molecular Property Prediction},
  author={Emily Nguyen and Yongchan Hong and Harsh Toshniwal and Yan Liu and Andreas Luttens},
  booktitle={Proceedings of the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2 (KDD '26)},
  year={2026},
  publisher={ACM},
  doi={10.1145/3770855.3819032}
}