TimesFM From Scratch (70M)

A from-scratch reimplementation of Google's TimesFM time-series foundation model, trained with a custom pretraining pipeline rebuilt from the paper (arXiv:2310.10688). This checkpoint is the 70M (small) configuration.

Code: https://github.com/FareedKhan-dev/timesfm-from-scratch
Full engineering write-up: see docs/PROJECT_JOURNEY.md in the code repo.
Blog: Building a 200M Parameter Time-Series Forecasting LLM From Scratch

The model takes the recent history of a single numeric series (for example, the last 512 points) and predicts the next 96 to 192 values zero-shot, returning a point forecast plus 9 quantiles.

Results (honest)

Zero-shot ETT (never trained on), standardized MAE, even-window protocol:

Model	Params	ETT MAE (standardized)	Ratio vs naive	Ratio vs seasonal-naive
Last-value naive	n/a	0.298	1.00	-
Seasonal-naive	n/a	0.272	0.91	1.00
This model	70M	0.246	0.83	0.90
Google TimesFM (released)	200M	0.215 (published)	reference	-
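For readers who want to reproduce the two baseline rows and the ratio columns, here is a minimal sketch. The function names and the `season` parameter are hypothetical (not taken from the repo), and the even-window protocol itself is not implemented here; the sketch only shows what "last-value naive", "seasonal-naive", and "MAE ratio" mean.

```python
import numpy as np

def last_value_naive(context: np.ndarray, horizon: int) -> np.ndarray:
    """Repeat the final observed value for every future step."""
    return np.full(horizon, context[-1], dtype=float)

def seasonal_naive(context: np.ndarray, horizon: int, season: int) -> np.ndarray:
    """Repeat the last full season of the context across the horizon.

    `season` is an assumption you must supply (e.g. 24 for hourly data
    with a daily cycle); the source does not state which value was used.
    """
    last_season = context[-season:]
    reps = int(np.ceil(horizon / season))
    return np.tile(last_season, reps)[:horizon].astype(float)

def mae(forecast: np.ndarray, target: np.ndarray) -> float:
    """Mean absolute error; 'standardized' MAE means both inputs are z-scored."""
    return float(np.mean(np.abs(forecast - target)))

# The ratio columns are a model's MAE divided by a baseline's MAE,
# using the numbers reported in the table above.
ratio_vs_naive = round(0.246 / 0.298, 2)      # 0.83
ratio_vs_seasonal = round(0.246 / 0.272, 2)   # 0.90 in the table
```

A ratio below 1.00 means the model's error is lower than the baseline's on the same windows.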

The improvement over both baselines is statistically significant under a paired bootstrap test. The model shows no sign of overfitting (flat through 120k training steps) and generalizes across structured domains.
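A paired bootstrap test of this kind can be sketched as follows. This is an illustration of the general technique, not the exact procedure used for the numbers above: the function name, the number of resamples, and the 95% interval are assumptions, and the inputs are expected to be per-window errors for the model and a baseline on the same evaluation windows.

```python
import numpy as np

def paired_bootstrap(err_model, err_base, n_boot=10_000, seed=0):
    """Paired bootstrap over evaluation windows.

    Each window keeps its (model error, baseline error) pair. Windows are
    resampled with replacement and the mean error difference is recomputed
    for every resample.
    Returns: (observed mean difference, 95% interval, share of resamples
    in which the model is NOT better than the baseline).
    """
    err_model = np.asarray(err_model, dtype=float)
    err_base = np.asarray(err_base, dtype=float)
    diff = err_model - err_base              # negative = model wins on that window
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, diff.size, size=(n_boot, diff.size))
    boot_means = diff[idx].mean(axis=1)
    lo, hi = np.percentile(boot_means, [2.5, 97.5])
    p_not_better = float(np.mean(boot_means >= 0.0))
    return float(diff.mean()), (float(lo), float(hi)), p_not_better
```

If the whole interval lies below zero, the model's lower error is unlikely to be an artifact of which windows happened to be sampled. Note that overlapping windows are correlated, so a block-wise resampling scheme may be more appropriate in practice.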

Honest limitations

Fails on random-walk data (on FX series it is about 31% worse than naive). It is not a good predictor of raw stock or currency prices.
Native prediction intervals are over-confident (too narrow); calibrated intervals use a post-hoc conformal step.
Below state of the art (0.246 vs Google's 0.215). The gap comes down to model scale (70M vs 200M parameters) and training data.
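The source does not say which conformal method is used for calibration. As one common option, here is a sketch of conformalized quantile regression: measure how far held-out targets fall outside the native interval, then widen the interval by a quantile of those misses. The function names, `alpha`, and the synthetic data are all assumptions for illustration.

```python
import numpy as np

def conformal_margin(lo_cal, hi_cal, y_cal, alpha=0.2):
    """Margin to add on both sides of a native [lo, hi] interval so that it
    covers about (1 - alpha) of held-out calibration targets.
    Assumes calibration and test points are exchangeable, which time series
    only approximately satisfy.
    """
    lo_cal, hi_cal, y_cal = (np.asarray(a, dtype=float) for a in (lo_cal, hi_cal, y_cal))
    # Positive score = target fell outside the interval by that much.
    scores = np.maximum(lo_cal - y_cal, y_cal - hi_cal)
    n = scores.size
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)  # finite-sample correction
    return float(np.quantile(scores, level, method="higher"))

def calibrate(lo, hi, margin):
    """Widen (or shrink, if margin is negative) the native interval."""
    return np.asarray(lo, dtype=float) - margin, np.asarray(hi, dtype=float) + margin

# Synthetic demo (not model output): targets are N(0, 1), and the "native"
# interval [-0.5, 0.5] is far too narrow for 80% coverage.
rng = np.random.default_rng(0)
y_cal, y_test = rng.normal(size=2000), rng.normal(size=2000)
margin = conformal_margin(np.full(2000, -0.5), np.full(2000, 0.5), y_cal, alpha=0.2)
lo, hi = calibrate(np.full(2000, -0.5), np.full(2000, 0.5), margin)
coverage_before = float(np.mean((y_test >= -0.5) & (y_test <= 0.5)))
coverage_after = float(np.mean((y_test >= lo) & (y_test <= hi)))
```

In this toy setup the calibrated interval covers far more of the test targets than the native one, which is the effect the post-hoc step is meant to achieve.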

Usage

# 1) get the code: https://github.com/FareedKhan-dev/timesfm-from-scratch  (add src/ to your path)
# 2) load the weights:
import torch
from huggingface_hub import hf_hub_download
from tsfm import config
from tsfm.model import build_model

ckpt = hf_hub_download("FareedKhan/timesfm-from-scratch-70m", "pytorch_model.pt")
model = build_model(config.small())
model.load_state_dict(torch.load(ckpt, map_location="cpu")["model"])
model.eval()

context = torch.randn(1, 512)                  # placeholder: replace with your last 512 points, standardized
with torch.no_grad():                          # inference only, no gradients needed
    point, quantiles = model.forecast(context, 96) # next 96 steps: point + 9 quantiles
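The snippet above expects a standardized context. A minimal sketch of preparing one from a raw series, and mapping a forecast back to the original scale, is shown below. The helper names are hypothetical, and per-window z-scoring is an assumption; check the repo for the scaler it actually uses.

```python
import numpy as np

def standardize_context(series, context_len=512, eps=1e-8):
    """Take the last `context_len` points and z-score them with their own
    mean and standard deviation. Returns (standardized window, mean, std)."""
    window = np.asarray(series, dtype=np.float32)[-context_len:]
    mean = float(window.mean())
    std = float(window.std()) + eps   # eps guards against a constant window
    return (window - mean) / std, mean, std

def destandardize(forecast, mean, std):
    """Map a standardized forecast back to the original units."""
    return np.asarray(forecast) * std + mean
```

With these helpers, `torch.from_numpy(z)[None, :]` would give a `(1, 512)` tensor to pass as `context`, and `destandardize(point.numpy(), mean, std)` would return the point forecast in the series' original units, assuming `point` is a CPU tensor.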

License

MIT. Reimplements ideas from TimesFM (Google) for research and education; not affiliated with Google.

Model size: 67.8M params (F32, Safetensors)

Paper

A decoder-only foundation model for time-series forecasting (arXiv:2310.10688, published Oct 14, 2023)