HSTU iter7 + iter8 Preservation (2026-06-07)

Full reproducibility archive for HSTU (Hierarchical Sequential Transduction Units, Meta 2024) experiments on MovieLens-20M and MovieLens-32M.

🏆 Headline SOTA result

iter8-exp4 = NEW ml-20m × BASE SOTA = NDCG@10 0.1948 (FULL canonical eval @ epoch 100)

  • +2.80% over HSTU-base paper baseline (0.1895)
  • Recipe: HSTU-base + PRISM-additive + Input Compression + Time Decay + linear_dropout=0.1 (the key discovery: low dropout matters)
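
For readers less familiar with the metric, here is a minimal sketch of NDCG@10 in the single-target setting that is common for next-item evaluation: one held-out item per user, ranked against the candidate set. With a single relevant item the ideal DCG is 1, so the per-user score is 1/log2(1 + rank) if the target lands in the top 10 and 0 otherwise. The function names are illustrative and are not taken from the archived code; the exact "FULL canonical" protocol is defined by the eval code in the archive, not by this sketch.

```python
import math
from typing import Iterable


def ndcg_at_k(rank: int, k: int = 10) -> float:
    """NDCG@k for one user whose single relevant item sits at 1-based `rank`."""
    if rank < 1:
        raise ValueError("rank is 1-based and must be >= 1")
    # One relevant item means ideal DCG == 1, so NDCG equals DCG.
    return 1.0 / math.log2(rank + 1) if rank <= k else 0.0


def mean_ndcg_at_k(ranks: Iterable[int], k: int = 10) -> float:
    """Average NDCG@k over users, given each user's target rank."""
    ranks = list(ranks)
    return sum(ndcg_at_k(r, k) for r in ranks) / len(ranks)
```

A target ranked first scores 1.0, a target ranked third scores 0.5, and anything outside the top 10 scores 0.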

📦 Contents

Each <name>.tar.gz is a self-contained reproducible bundle with:

  • config/*.gin: full gin config + include chain
  • ckpts/HSTU-..._ep100: resumable PyTorch checkpoint (model + optimizer + RNG + epoch + batch_id state)
  • tb/.../events.out.tfevents.*: full TensorBoard event files with per-epoch trajectory
  • MANIFEST.json: machine-readable metadata

Plus code-archive.tar.gz containing the full forked buck-iter2 codebase with iter2 patches (resumable ckpts, EVAL2, in-memory dataset preloader, fbgemm compat fix) + iter7/iter8 patches (PRISM-additive, low-dropout, time-decay).
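
To sanity-check a downloaded bundle against the layout listed above without unpacking it, something like the following sketch may help. It assumes only that the archive is a gzip-compressed tar with `config`, `ckpts`, and `tb` directories and a `MANIFEST.json` somewhere inside. Whether the files sit under a wrapping top-level directory, and which keys the manifest contains, are not stated here, so the code makes no assumption about either. `summarize_bundle` is a hypothetical helper, not part of the archived code.

```python
import json
import tarfile
from collections import Counter
from pathlib import PurePosixPath

# Directory names taken from the bundle layout described above.
SECTIONS = ("config", "ckpts", "tb")


def summarize_bundle(path):
    """Return (manifest or None, Counter of regular files per section)."""
    manifest = None
    counts = Counter()
    with tarfile.open(path, "r:gz") as tar:
        for member in tar:
            if not member.isfile():
                continue
            parts = PurePosixPath(member.name).parts
            # Pick up the first MANIFEST.json found at any depth.
            if parts and parts[-1] == "MANIFEST.json" and manifest is None:
                manifest = json.load(tar.extractfile(member))
            # Attribute the file to the first known section in its path.
            for section in SECTIONS:
                if section in parts[:-1]:
                    counts[section] += 1
                    break
    return manifest, counts
```

Calling `summarize_bundle("iter8-exp4-SOTA.tar.gz")` after the download step should return the parsed manifest and a non-zero file count for each of the three directories if the bundle is intact.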

🚀 Quick reproduction

# Download the SOTA bundle
hf download tzchen07/hstu-iter78-preservation iter8-exp4-SOTA.tar.gz --local-dir .

# Download the code
hf download tzchen07/hstu-iter78-preservation code-archive.tar.gz --local-dir .
tar xzf code-archive.tar.gz
cd buck-iter2

# Set up env (Python 3.11 + torch 2.7.1+cu128 + fbgemm-gpu 1.2.0)
python3 -m venv .venv && source .venv/bin/activate
pip install torch==2.7.1 --index-url https://download.pytorch.org/whl/cu128
pip install fbgemm-gpu==1.2.0 torchrec gin-config tensorboard absl-py pandas numpy

# Stage data (preprocessed ml-20m npz files - see separate data archive)

# Run training (the SOTA recipe)
NCCL_TUNER_CONFIG_PATH=/shared/nccl_tuner.textproto \
CUDA_VISIBLE_DEVICES=0 PYTHONPATH=$(pwd) \
python3 generative_recommenders/github/main.py \
    --gin_config_file=generative_recommenders/github/configs/ml-20m/iter8-exp4-additive-lowdrop.gin \
    --master_port=12300

See PRESERVATION_REPORT.md for the full reproducibility audit (including data preprocessing notes, env spec, and per-experiment metadata).
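
Before launching a 100-epoch run it can be worth confirming that the virtualenv matches the versions pinned in the setup comment (Python 3.11, torch 2.7.1, fbgemm-gpu 1.2.0). The sketch below uses only the standard library. `check_env` is a hypothetical helper, and stripping the local build tag (for example `+cu128`) before comparing is a simplifying assumption, so it does not verify the CUDA variant.

```python
import sys
from importlib import metadata

# Pins copied from the setup comment in the quick-reproduction steps.
EXPECTED = {"torch": "2.7.1", "fbgemm-gpu": "1.2.0"}


def check_env(expected=EXPECTED, python=(3, 11)):
    """Return a list of human-readable mismatches (empty list means OK)."""
    problems = []
    found_py = tuple(sys.version_info[:2])
    if found_py != tuple(python):
        problems.append(
            f"python: expected {python[0]}.{python[1]}, "
            f"found {found_py[0]}.{found_py[1]}"
        )
    for name, want in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        # Drop a local build tag such as "+cu128" before comparing.
        if found.split("+")[0] != want:
            problems.append(f"{name}: expected {want}, found {found}")
    return problems
```

Running `print(check_env() or "environment matches")` inside the activated `.venv` lists any drift from the pinned versions.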

📊 All 8 experiments

| Bundle | NDCG@10 (FULL canonical, ep100) | Dataset | Model | Recipe |
| --- | --- | --- | --- | --- |
| iter8-exp4-SOTA.tar.gz ⭐ | 0.1948 (+2.80% vs. paper) | ml-20m | BASE | PRISM-additive + IC + TD + dropout=0.1 |
| iter8-exp1-HardNeg.tar.gz | 0.1920 (+1.32% vs. paper) | ml-20m | BASE | PRISM-additive + IC + TD + HardNeg |
| iter8-exp0-PopDebias.tar.gz | 0.1900 (+0.26% vs. paper) | ml-20m | BASE | PRISM-additive + IC + TD + PopDebias |
| iter7-exp1-PRISM-FILM-PopDebias-SOTA.tar.gz | 0.1912 (+0.90% vs. paper) | ml-20m | BASE | PRISM-FILM + IC + TD + PopDebias |
| iter7-exp2-PRISM-MoE.tar.gz | 0.1895 (paper match) | ml-20m | BASE | PRISM-MoE + IC + TD |
| iter7-exp0-PRISM-FILM-HardNeg.tar.gz | 0.1880 (−0.79% vs. paper) | ml-20m | BASE | PRISM-FILM + IC + TD + HardNeg |
| iter7-exp5-l700.tar.gz | 0.1901 | ml-20m | BASE | PRISM-FILM + IC + TD, l=700 |
| iter7-exp6-ml32m-base.tar.gz | 0.1508 | ml-32m | BASE | PRISM-additive + IC + TD (first ml-32m × BASE canonical result) |
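
The percentage deltas quoted for the ml-20m runs are relative changes against the HSTU-base paper baseline of 0.1895, i.e. (score − 0.1895) / 0.1895. As a quick arithmetic check, the short script below recomputes them from the NDCG@10 values above. It only includes the rows for which a delta is stated, and the dictionary keys are shortened bundle names chosen for readability.

```python
PAPER_BASELINE = 0.1895  # HSTU-base paper NDCG@10 on ml-20m, as quoted above

# NDCG@10 values copied from the experiment list (ml-20m rows with a stated delta).
ML20M_SCORES = {
    "iter8-exp4": 0.1948,
    "iter8-exp1-HardNeg": 0.1920,
    "iter8-exp0-PopDebias": 0.1900,
    "iter7-exp1-PRISM-FILM-PopDebias": 0.1912,
    "iter7-exp2-PRISM-MoE": 0.1895,
    "iter7-exp0-PRISM-FILM-HardNeg": 0.1880,
}


def relative_gain_pct(score, baseline=PAPER_BASELINE):
    """Relative change of `score` over `baseline`, in percent."""
    return (score - baseline) / baseline * 100.0


for name, score in ML20M_SCORES.items():
    print(f"{name}: {score:.4f} ({relative_gain_pct(score):+.2f}%)")
```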

⚖️ License & attribution

  • Code: MIT (matches Meta's HSTU OSS code)
  • Data: MovieLens (research use, see GroupLens license)
  • Trained weights: research artifacts; for reproducibility & verification

🙏 Citation

If you use this preservation in your work:

@misc{chen2026hstu_iter78,
  title  = {HSTU iter7+iter8 Preservation: PRISM-additive + low-dropout SOTA on MovieLens-20M},
  author = {Chen, Tony},
  year   = {2026},
  url    = {https://huggingface.co/tzchen07/hstu-iter78-preservation},
  note   = {NDCG@10 0.1948 (+2.80\% over HSTU-base paper baseline)}
}

Built on top of:

  • Zhai et al., 2024. "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (Meta HSTU)
  • Kang and McAuley, 2018. "Self-Attentive Sequential Recommendation" (SASRec)

Preserved 2026-06-07 by Rovo Dev. Full bundle md5 checksums in MANIFEST.md.
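
To check a downloaded bundle against its published md5 checksum, a streaming hash avoids loading a multi-gigabyte tarball into memory. This is a minimal sketch: the layout of the checksum listing is not described here, so the expected hex digest is passed in by hand rather than parsed, and both function names are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Stream `path` in 1 MiB chunks and return its md5 hex digest."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected_hex):
    """True if the file's md5 matches `expected_hex` (case-insensitive)."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

For example, `verify_md5("iter8-exp4-SOTA.tar.gz", "<hex digest from the checksum listing>")` returns `True` only when the local copy matches the published checksum.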
