STRIDE: Training Data Attribution via Sparse Recovery from Subset Perturbations

This repository contains the trained STRIDE steering operators, and the nanochat base checkpoints they attribute, for the four pre-training models in the paper. STRIDE attributes a model's prediction back to the pre-training examples that shaped it by learning a small activation-steering operator instead of retraining the model.

depth	params	base ckpt step	operator	LDS (Spearman)
d12	286M	1680	layer 8, rank 16	0.156
d16	537M	3584	layer 10, rank 16	0.177
d20	897M	3320	layer 12, rank 16	0.158
d24	1.38B	5568	layer 15, rank 16	0.165

Layout

base_checkpoints/<tag>/model_<step>.pt   # nanochat base checkpoint
base_checkpoints/<tag>/meta_<step>.json  # nanochat config
operators/<tag>/operator.pt     # trained SteeringOperator state dict
operators/<tag>/subsets.npy     # subset membership (K=1000, d=10)
operators/<tag>/meta.json       # training config + n_train + operator dims
tokenizer/                      # shared nanochat tokenizer
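
As a minimal sketch, the layout above can be turned into concrete paths for one model tag. The helper `stride_paths` is hypothetical and not part of the STRIDE package. It also assumes `step` is passed as a string exactly as it appears in the file name, since the layout does not say whether the step is zero-padded.

```python
from pathlib import Path


def stride_paths(root, tag, step):
    """Build the expected artifact paths for one model tag (hypothetical helper).

    root: local directory holding a copy of this repository
    tag:  model tag such as "d12"
    step: checkpoint step as a string, exactly as written in the file name
          (assumption: the padding convention is not stated in the layout)
    """
    root = Path(root)
    base = root / "base_checkpoints" / tag
    ops = root / "operators" / tag
    return {
        "model": base / f"model_{step}.pt",         # nanochat base checkpoint
        "model_meta": base / f"meta_{step}.json",   # nanochat config
        "operator": ops / "operator.pt",            # SteeringOperator state dict
        "subsets": ops / "subsets.npy",             # subset membership
        "operator_meta": ops / "meta.json",         # training config and dims
        "tokenizer": root / "tokenizer",            # shared nanochat tokenizer
    }


paths = stride_paths("stride-nanochat", "d12", "1680")
for name, path in paths.items():
    print(f"{name}: {path}")
```

Checking `path.exists()` on each entry before loading is a cheap way to catch a mismatched tag or step.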

Usage

from stride.inference import Stride
attr = Stride.from_pretrained("d12")
result = attr.attribute(my_queries)

The LDS ground-truth losses and the held-out test set live in the dataset repo rishitdagli/stride-lds.

Citation

@misc{dagli2026stridetrainingdataattribution,
      title={STRIDE: Training Data Attribution via Sparse Recovery from Subset Perturbations}, 
      author={Rishit Dagli and Abir Harrasse and Luke Zhang and Florent Draye and Amirali Abdullah and Bernhard Schölkopf and Zhijing Jin},
      year={2026},
      eprint={2606.05165},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2606.05165}, 
}

