Stormer: Decay Finetuning Checkpoints

PyTorch Lightning checkpoints from a Stormer decay-finetuning sweep (h512_d20_n8: hidden size 512, depth 20, 8 heads).

Each subfolder corresponds to one run and contains:

  • checkpoints/epoch_*.ckpt: best checkpoint (monitored on val/w_mse_aggregate_72_hrs_ensemble_mean)
  • checkpoints/last.ckpt: last checkpoint (identical to the epoch checkpoint for every run except decay_h512_d20_n8_ep9_d1)
  • config.yaml: the full LightningCLI training config

Runs

Folder                        Epochs  Decay (d)
decay_h512_d20_n8_ep9_d1      9       1
decay_h512_d20_n8_ep18_d2     18      2
decay_h512_d20_n8_ep36_d4     36      4
decay_h512_d20_n8_ep54_d6     54      6
decay_h512_d20_n8_ep72_d8     72      8
decay_h512_d20_n8_ep90_d10    90      10
decay_h512_d20_n8_ep108_d12   108     12
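
The folder names in the table follow a regular pattern, and every listed run uses nine epochs per unit of decay. A minimal sketch that rebuilds the repo-relative checkpoint paths from the decay values, assuming the pattern holds exactly as tabulated (the helpers `run_folder` and `last_ckpt_path` are hypothetical and not part of this repository):

```python
# Decay values listed in the Runs table.
DECAY_VALUES = [1, 2, 4, 6, 8, 10, 12]

# Observed in the table: epochs = 9 * decay for every run.
EPOCHS_PER_DECAY = 9


def run_folder(decay: int) -> str:
    """Return the run folder name for a decay value (hypothetical helper)."""
    epochs = EPOCHS_PER_DECAY * decay
    return f"decay_h512_d20_n8_ep{epochs}_d{decay}"


def last_ckpt_path(decay: int) -> str:
    """Return the repo-relative path of last.ckpt for a run (hypothetical helper)."""
    return f"{run_folder(decay)}/checkpoints/last.ckpt"


if __name__ == "__main__":
    # Print one path per run, in table order.
    for d in DECAY_VALUES:
        print(last_ckpt_path(d))
```

Each returned path can be passed as the filename argument to hf_hub_download when looping over the sweep.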

Loading

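import torch
from huggingface_hub import hf_hub_download

# Download one checkpoint file from the Hub (cached locally after the first call).
ckpt = hf_hub_download(
    "KyleBae1017/stormer-decay-checkpoints",
    "decay_h512_d20_n8_ep108_d12/checkpoints/last.ckpt",
)
# Lightning checkpoints hold more than tensors, so weights_only=False may be
# needed on recent PyTorch versions; only unpickle files you trust.
state = torch.load(ckpt, map_location="cpu", weights_only=False)
# The model weights are stored under state["state_dict"].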
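
A loaded Lightning checkpoint is a plain dictionary whose "state_dict" entry maps parameter names to tensors. Those names are often prefixed with the attribute under which the LightningModule stores the network. The following is a minimal sketch of pulling the weights out and stripping such a prefix. The helper `extract_weights` and the prefix "net." are hypothetical; inspect the actual keys of the checkpoint before choosing a prefix.

```python
from typing import Any, Dict


def extract_weights(checkpoint: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Return the entries under "state_dict", with `prefix` stripped from matching keys.

    Keys that do not start with `prefix` are kept unchanged.
    """
    state_dict = checkpoint["state_dict"]
    return {
        (key[len(prefix):] if prefix and key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


if __name__ == "__main__":
    # Stand-in for the dictionary returned by torch.load(...);
    # in a real checkpoint the values would be tensors.
    fake = {"epoch": 3, "state_dict": {"net.head.weight": 1.0, "net.head.bias": 0.0}}
    print(sorted(extract_weights(fake, prefix="net.")))
```

With a real checkpoint, the resulting dictionary could be passed to load_state_dict on a model instance built from the settings in config.yaml.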