Relational Transformer: PluRel Checkpoints

Relational Transformer (RT) model checkpoints pretrained on synthetic relational databases generated by PluRel.

Relational Transformer is a foundation model architecture for relational data that enables zero-shot transfer across heterogeneous schemas and tasks. It was introduced in:

Relational Transformer: Toward Zero-Shot Foundation Models for Relational Data
Rishabh Ranjan, Valter Hudovernik, Mark Znidar, Charilaos Kanatsoulis, Roshan Upendra, Mahmoud Mohammadi, Joe Meyer, Tom Palczewski, Carlos Guestrin, Jure Leskovec. arXiv:2510.06377 (ICLR 2026)

The checkpoints provided in this repository were trained using the methodology described in:

PluRel: Synthetic Data unlocks Scaling Laws for Relational Foundation Models
Kothapalli, Ranjan, Hudovernik, Dwivedi, Hoffart, Guestrin, Leskovec. arXiv:2602.04029 (2026)

Links: arXiv (RT) | GitHub (RT) | arXiv (PluRel) | Project Page (PluRel) | GitHub (PluRel) | Dataset


Model Architecture

The Relational Transformer operates on multi-table relational databases. It samples a context of rows from linked tables in breadth-first (BFS) order and treats them as a single sequence. Its Relational Attention mechanism attends over columns, rows, and primary-foreign key links.
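To make the idea of BFS-ordered context sampling concrete, here is a minimal sketch. It is not the RT implementation: the function name, the row-id strings, and the use of a row budget in place of the token budget are all assumptions made for illustration, and reading "Max BFS width" as a per-row cap on expanded neighbors is only one plausible interpretation.

```python
from collections import deque

def bfs_context(seed, neighbors, max_rows, max_width):
    """Collect rows reachable from `seed` in breadth-first order.

    neighbors: dict mapping a row id to the row ids it is linked to
               through primary-foreign key relations.
    max_rows:  total row budget (a stand-in for the token budget).
    max_width: cap on new neighbors expanded per visited row (assumed meaning).
    """
    order = [seed]
    seen = {seed}
    queue = deque([seed])
    while queue and len(order) < max_rows:
        row = queue.popleft()
        added = 0
        for nxt in neighbors.get(row, []):
            if nxt in seen:
                continue
            if added == max_width or len(order) == max_rows:
                break
            seen.add(nxt)
            order.append(nxt)
            queue.append(nxt)
            added += 1
    return order

# Toy database (hypothetical): one order linked to a customer and two items.
links = {
    "order:1": ["customer:7", "item:3", "item:4"],
    "customer:7": ["order:1", "order:2"],
    "item:3": ["order:1"],
    "item:4": ["order:1"],
    "order:2": ["customer:7"],
}
context = bfs_context("order:1", links, max_rows=4, max_width=2)
```

With a width cap of 2, the third neighbor of the seed row is skipped and the remaining budget goes to the next BFS level instead.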

Hyperparameter               Value
Transformer blocks           12
Model dimension (d_model)    256
Attention heads              8
FFN dimension (d_ff)         1,024
Context length               1,024 tokens
Text encoder                 all-MiniLM-L12-v2 (d_text = 384)
Max BFS width                128
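The hyperparameters above can be collected into a small config object to check how they relate. This is a sketch only: the class and field names are hypothetical and need not match the keys in config.json, and the per-head dimension assumes the usual even split of d_model across attention heads.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RTConfig:
    # Values copied from the hyperparameter table; field names are illustrative.
    num_blocks: int = 12
    d_model: int = 256
    num_heads: int = 8
    d_ff: int = 1024
    context_length: int = 1024
    d_text: int = 384
    max_bfs_width: int = 128

    @property
    def head_dim(self) -> int:
        # Assumes d_model is split evenly across heads (standard multi-head attention).
        if self.d_model % self.num_heads != 0:
            raise ValueError("d_model must be divisible by num_heads")
        return self.d_model // self.num_heads

cfg = RTConfig()
ffn_expansion = cfg.d_ff // cfg.d_model  # ratio of FFN width to model width
```

Under these assumptions each head works in 32 dimensions and the FFN expands the model dimension by a factor of 4.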

The architecture and training loop build on the Relational Transformer codebase.


Download

Single checkpoint (Python). Fetch config.json alongside the weights; it carries the model architecture and is what the Hub uses to count downloads:

from huggingface_hub import hf_hub_download

# Each call returns the local path of the cached file.
config = hf_hub_download("stanford-star/rt-plurel", "config.json")
ckpt = hf_hub_download("stanford-star/rt-plurel", "synthetic-pretrain_rdb_1024_size_4b.pt")
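Once both files are on disk, they can be inspected along these lines. This is a hedged sketch: it assumes config.json is plain JSON and that the .pt file was written with torch.save. The keys inside either file are not documented here, so the helper names and the return types are assumptions.

```python
import json

def read_config(path):
    """Return the parsed config.json as a dict."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

def load_checkpoint(path):
    """Load a .pt file onto the CPU (assumes a standard torch.save artifact)."""
    import torch  # imported lazily so read_config works without PyTorch installed
    return torch.load(path, map_location="cpu")
```

Pass the paths returned by hf_hub_download, for example read_config(config) and load_checkpoint(ckpt). Loading onto the CPU first avoids requiring a GPU just to inspect the file.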

Full repository (CLI):

hf download stanford-star/rt-plurel \
    --repo-type model \
    --local-dir ~/scratch/rt_hf_ckpts

RelBench leaderboard checkpoints (added 2026-06)

Training follows two protocols: the repo's continued-pretraining script (50k steps, batch size 128, learning rate 5e-4 with a cosine schedule, starting from synthetic-pretrain_rdb_1024_size_4b.pt) and the RT example_finetune protocol (learning rate 1e-4, batch size 32, 2^15+1 steps). For regression tasks, the best checkpoint is selected by validation NMAE (MAE divided by the train-split standard deviation, ddof=1), which is the leaderboard metric, instead of R². Evaluation uses the full official test split.
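The NMAE definition used for checkpoint selection can be written out as a short function. This is a minimal sketch of the stated formula using only the standard library; the function name and the toy numbers are illustrative, not taken from the evaluation code.

```python
from statistics import stdev

def nmae(y_true, y_pred, y_train):
    """Mean absolute error divided by the sample std of the training targets."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    scale = stdev(y_train)  # statistics.stdev uses the n-1 denominator (ddof=1)
    return mae / scale

# Toy numbers: the training targets have sample variance 2.5, and the MAE is 1.5.
train_targets = [1.0, 2.0, 3.0, 4.0, 5.0]
score = nmae([2.0, 4.0], [3.0, 2.0], train_targets)
```

Dividing by the train-split standard deviation makes scores comparable across tasks whose targets live on very different scales.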

  • cntd-pretrain_<db>_<task>.pt: synthetic+real continued pretraining, with one leave-one-DB-out run per database (including rel-event) and the best checkpoint kept per task. These produce the "PluRel | synthetic + real" zero-shot regression and rel-event cells.
  • finetune_<db>_<task>.pt: fine-tuned from the matching cntd-pretrain checkpoint, which was chosen over the synthetic-only checkpoint by validation zero-shot performance (the synthetic+real checkpoint won on every task). These produce the "PluRel | pretrained + fine-tuned" leaderboard row.
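The naming pattern above lends itself to a small helper for fetching a single per-task checkpoint. This is a sketch under assumptions: the helper names are hypothetical, and the database and task strings in the example are placeholders, since the exact spellings used in the repository's filenames are not listed here.

```python
def checkpoint_filename(stage, db, task):
    """Build a checkpoint name following the <stage>_<db>_<task>.pt pattern."""
    if stage not in ("cntd-pretrain", "finetune"):
        raise ValueError(f"unknown stage: {stage!r}")
    return f"{stage}_{db}_{task}.pt"

def fetch_checkpoint(stage, db, task, repo_id="stanford-star/rt-plurel"):
    """Download one per-task checkpoint and return its local path."""
    from huggingface_hub import hf_hub_download  # lazy import; needs network access
    return hf_hub_download(repo_id, checkpoint_filename(stage, db, task))

# Hypothetical database/task strings, for illustration only.
name = checkpoint_filename("finetune", "rel-amazon", "user-ltv")
```

Before relying on a constructed name, it is worth confirming it against the actual file listing, for example with huggingface_hub.list_repo_files("stanford-star/rt-plurel").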