FuXi-2.1

FuXi-2.1 is a global, deterministic machine-learning weather forecasting model developed by Fudan University & SAIS. It produces global forecasts at 0.25° resolution in 6-hourly steps out to 10 days.

FuXi-2.1 targets a characteristic failure mode of data-driven weather prediction: forecasts blur toward a smooth spatial average as lead time grows, erasing the small-scale structure that matters most for extremes. FuXi-2.1 produces markedly sharper fields whose spatial power spectra track observations across the full wavenumber range. It keeps deterministic skill (RMSE) comparable to FuXi-1.0 and substantially improves extreme-event detection for heavy precipitation and strong wind.

This model is released as part of the FuXi Single collection.

What's new in 2.1
Quickstart
Model overview
Data details
Evaluation
Known limitations
Citation

What's new in 2.1

Relative to FuXi-1.0, FuXi-2.1 introduces:

A flat Transformer backbone replacing FuXi-1.0's U-Transformer (ResNet downsample → Swin Transformer → upsample). FuXi-2.1 drops the U-shaped up/down-sampling in favour of a single full-resolution Transformer trunk.
Rotary position embeddings (RoPE) inside the Swin windowed attention, replacing the learned relative-position bias used in FuXi-1.0.
adaLN time conditioning that injects time-period information — forecast lead step, time-of-day and day-of-year phase — into every block, inspired by diffusion transformers in image generation.
A variable-aware multi-head decoder that gives pressure-level, surface and derived variables their own specialised output heads.

The combined effect is sharper, spectrally faithful forecasts with no penalty on mean-error skill.
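
To make the rotary position embedding item above more concrete, here is a minimal sketch of 1-D RoPE in plain Python. The function name `rope_rotate`, the base of 10000 and the toy vectors are assumptions for illustration only; they are not taken from the FuXi-2.1 code, and the sketch ignores windowing and multi-head details.

```python
import math

def rope_rotate(vec, pos, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles (1-D RoPE)."""
    dim = len(vec)
    assert dim % 2 == 0, "RoPE needs an even feature dimension"
    out = []
    for i in range(0, dim, 2):
        theta = pos * base ** (-i / dim)  # earlier pairs rotate faster
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical query/key vectors.
q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.9]

# Same relative offset (3) at two different absolute positions.
score_a = dot(rope_rotate(q, 5), rope_rotate(k, 2))
score_b = dot(rope_rotate(q, 13), rope_rotate(k, 10))
print(abs(score_a - score_b) < 1e-9)
```

The two scores agree because rotating query and key by their own positions leaves a dot product that depends only on the position difference. A learned relative-position bias achieves the same dependence with extra parameters; RoPE gets it from the rotation itself.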

Quickstart

This repository ships the exported model (fuxi-2.1.pt2), normalization statistics (mean.nc, std.nc), a sample pre-normalized input (input.nc), and minimal inference code.

# 1. Install dependencies
pip install -r requirements.txt

# 2. End-to-end demo (inference + plots)
bash run.sh --model_dir . --input input.nc --steps 5

# Or run inference directly (40 steps = 10-day forecast):
python inference.py \
    --model_dir . \
    --input input.nc \
    --output_dir ./output \
    --steps 40 \
    --forecast_time 2024092900

# 3. Plot selected channels
python plot.py --output_dir ./output --channels t2m z500 tp --discrete

Input — a NetCDF with a variable input of shape (time=2, channel=85, lat=721, lon=1440), z-score normalized, coordinates lat 90→−90 and lon 0→359.75. The provided input.nc is a sample for 2024-09-29 00Z.
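
The grid layout described above can be reproduced with a few lines of standard-library Python, which is handy for locating the grid cell nearest to a point of interest. This is a sketch: `nearest_index` and the constant names are hypothetical helpers, not part of the shipped inference code.

```python
N_LAT, N_LON, RES = 721, 1440, 0.25

# Coordinates as described above: latitude north to south, longitude eastward from 0.
lats = [90.0 - RES * i for i in range(N_LAT)]
lons = [RES * j for j in range(N_LON)]

def nearest_index(lat, lon):
    """Return (row, col) of the grid point closest to a location.

    Accepts longitudes in either [-180, 180) or [0, 360).
    """
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude out of range")
    row = round((90.0 - lat) / RES)
    col = round((lon % 360.0) / RES) % N_LON  # wrap values near 360 back to column 0
    return row, col

row, col = nearest_index(31.23, 121.47)  # roughly Shanghai
print(row, col, lats[row], lons[col])
```

Because latitude runs from 90 down to −90, row 0 is the North Pole; code written for south-to-north grids needs the latitude axis flipped before use.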

Output — each step saved as {output_dir}/{step:03d}.nc, shape (channel=85, lat=721, lon=1440) in physical units (denormalized), with a valid_time attribute. Steps are 1-based: 001.nc = +6 h, … 040.nc = +240 h (10 days).
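
The step numbering above maps to file names and valid times as in the following sketch. It assumes `--forecast_time` uses the `YYYYMMDDHH` form seen in the example command; the helper names are hypothetical, and the exact string format of the stored valid_time attribute is not specified here.

```python
from datetime import datetime, timedelta

STEP_HOURS = 6  # one autoregressive step

def step_filename(step):
    """File name for a 1-based step index, e.g. 1 -> '001.nc'."""
    if step < 1:
        raise ValueError("steps are 1-based")
    return f"{step:03d}.nc"

def valid_time(forecast_time, step):
    """Valid time of a step, given an initialisation string such as '2024092900'."""
    init = datetime.strptime(forecast_time, "%Y%m%d%H")
    return init + timedelta(hours=STEP_HOURS * step)

for step in (1, 4, 40):
    print(step_filename(step), valid_time("2024092900", step).isoformat())
```

For the sample initialisation, step 40 corresponds to 2024-10-09 00Z, that is, 240 h after 2024-09-29 00Z.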

GPU: the device is baked into the exported graph, so the model must be loaded on a CUDA device. About 8 GB of GPU memory is enough (model ~4 GB + recurrent state ~1.4 GB + working memory). Tested on A100, V100, RTX 3090/4090.

See variables.py for the full ordered channel list.

Model overview

Model description

FuXi-2.1 is a single Transformer. The global atmospheric state is split into patches and embedded into tokens, processed by a stack of windowed-attention blocks, and read out by a variable-aware multi-head decoder. The model is deterministic — one forward pass per step, with no adversarial or diffusion sampling at inference — and is rolled out autoregressively at 6-hourly steps.

Developed by: Fudan University & SAIS
Model type: Transformer (patch-embed → Swin attention with RoPE + adaLN → multi-head decoder)
Forecast type: Global, deterministic, autoregressive
License: CC BY 4.0
Predecessor: FuXi-1.0

Architecture details

Component	Specification
Backbone	Single Transformer trunk (no U-Net up/down-sampling)
Attention	Swin windowed attention
Position encoding	Rotary (RoPE, 1-D)
Normalisation / conditioning	adaLN, conditioned on lead step, time-of-day, day-of-year
Feed-forward	SwiGLU
Decoder	Variable-aware multi-head (pressure / surface / derived)
Input frames	2 (states at t₀−6h and t₀)
Output	State at t₀+6h, rolled out autoregressively
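
The two-frame autoregressive roll-out summarised in the table can be sketched as a sliding window over states. Here `step_fn` is a stand-in for one forward pass of the model and `toy_step` is a deliberately trivial placeholder; both are assumptions for illustration, not the shipped inference loop.

```python
def rollout(step_fn, prev_state, curr_state, n_steps):
    """Autoregressive roll-out with a two-frame input window.

    step_fn(prev, curr) -> next stands in for one model forward pass.
    """
    outputs = []
    for _ in range(n_steps):
        next_state = step_fn(prev_state, curr_state)
        outputs.append(next_state)
        # Slide the window: the newest prediction becomes the current frame.
        prev_state, curr_state = curr_state, next_state
    return outputs

# Toy stand-in "model": linear extrapolation of a scalar state.
def toy_step(prev, curr):
    return curr + (curr - prev)

forecast = rollout(toy_step, prev_state=1.0, curr_state=2.0, n_steps=4)
print(forecast)  # [3.0, 4.0, 5.0, 6.0]
```

With 6-hourly steps, `n_steps=40` covers the full 10-day range. Since each prediction is fed back as input, errors can accumulate with lead time, which is why sharpness at long leads is a meaningful test.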

Model resolution

Model	Horizontal resolution	Vertical resolution [pressure levels] (hPa)
FuXi-2.1	0.25° (721×1440)	13: 50, 100, 150, 200, 250, 300, 400, 500, 600, 700, 850, 925, 1000

Data details

Training data

FuXi-2.1 is trained and evaluated on ERA5 reanalysis at 0.25° resolution, 6-hourly.

Training period: 2002–2023
Test period: 2024 (held out)

Data parameters

FuXi-2.1 operates on 85 channels per time step: 65 pressure-level channels (5 variables × 13 levels) and 20 surface channels, plus static forcings supplied as constant inputs. Of these, 80 channels are prognostic (the 65 pressure-level channels and 15 of the surface channels): they are both input and output, and are fed back during roll-out. The remaining five (four radiation fluxes and total precipitation) are diagnostic outputs produced through a dedicated decoder head; they are predicted but not fed back as inputs.

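Channel order (exact): the 65 pressure-level channels first (z, then t, u, v, q, each over the 13 levels 50→1000 hPa), followed by the 20 surface channels: msl, t2m, d2m, sst, ws10m, ws100m, u10m, v10m, u100m, v100m, lcc, mcc, hcc, tcc, ssr, ssrd, fdir, ttr, tcw, tp. The surface table below lists prognostic rows before diagnostic ones, so tcw appears earlier there than in the channel order, where it follows ttr.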
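
The channel order above can be turned into a name-to-index lookup as follows. This is a sketch: the `<variable><level>` naming for pressure-level channels (for example `z500`) is inferred from the plot command in the Quickstart, and variables.py remains the authoritative list.

```python
LEVELS = [50, 100, 150, 200, 250, 300, 400, 500, 600, 700, 850, 925, 1000]
PRESSURE_VARS = ["z", "t", "u", "v", "q"]
SURFACE_VARS = [
    "msl", "t2m", "d2m", "sst", "ws10m", "ws100m", "u10m", "v10m",
    "u100m", "v100m", "lcc", "mcc", "hcc", "tcc", "ssr", "ssrd",
    "fdir", "ttr", "tcw", "tp",
]

# Variable-major order: all 13 levels of z, then all levels of t, and so on.
# The "<var><level>" names are an assumption; check variables.py for the real ones.
CHANNELS = [f"{v}{lev}" for v in PRESSURE_VARS for lev in LEVELS] + SURFACE_VARS
CHANNEL_INDEX = {name: i for i, name in enumerate(CHANNELS)}

print(len(CHANNELS), CHANNEL_INDEX["z500"], CHANNEL_INDEX["t2m"], CHANNEL_INDEX["tp"])
```

Indices here are 0-based, so under these assumptions `z500` is channel 7 and `tp` is channel 84 along the 85-long channel axis.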

Pressure-level parameters (13 levels: 50–1000 hPa)

Short name	Name	Units	Input/Output
z	Geopotential	m²·s⁻²	Both (prognostic)
t	Temperature	K	Both (prognostic)
u	Eastward wind	m·s⁻¹	Both (prognostic)
v	Northward wind	m·s⁻¹	Both (prognostic)
q	Specific humidity	kg·kg⁻¹	Both (prognostic)

Surface parameters (20)

Short name	Name	Units	Input/Output
msl	Mean sea-level pressure	Pa	Both (prognostic)
t2m	2 m temperature	K	Both (prognostic)
d2m	2 m dewpoint temperature	K	Both (prognostic)
sst	Sea-surface temperature	K	Both (prognostic)
ws10m	10 m wind speed	m·s⁻¹	Both (prognostic)
ws100m	100 m wind speed	m·s⁻¹	Both (prognostic)
u10m	10 m eastward wind	m·s⁻¹	Both (prognostic)
v10m	10 m northward wind	m·s⁻¹	Both (prognostic)
u100m	100 m eastward wind	m·s⁻¹	Both (prognostic)
v100m	100 m northward wind	m·s⁻¹	Both (prognostic)
lcc	Low cloud cover	0–1	Both (prognostic)
mcc	Medium cloud cover	0–1	Both (prognostic)
hcc	High cloud cover	0–1	Both (prognostic)
tcc	Total cloud cover	0–1	Both (prognostic)
tcw	Total column water	kg·m⁻²	Both (prognostic)
ssr	Surface net solar radiation	J·m⁻²	Output (diagnostic)
ssrd	Surface solar radiation downwards	J·m⁻²	Output (diagnostic)
fdir	Total-sky direct solar radiation at surface	J·m⁻²	Output (diagnostic)
ttr	Top net thermal radiation	J·m⁻²	Output (diagnostic)
tp	Total precipitation	mm	Output (diagnostic)

Field	Level type	Input/Output
Land-sea mask, orography/geopotential, latitude/longitude encodings, time-of-day / day-of-year	Surface / static	Input (forcings)
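
Periodic quantities such as time-of-day and day-of-year are commonly encoded as sine/cosine pairs so that 23:00 and 00:00, or 31 December and 1 January, end up close together. The sketch below shows that idea only; this card does not specify the exact encoding FuXi-2.1 uses, and `time_phase_features` is a hypothetical helper.

```python
import math
from datetime import datetime

def time_phase_features(when):
    """Encode time-of-day and day-of-year as (sin, cos) pairs.

    Assumption: a plain sine/cosine phase encoding, chosen for illustration.
    """
    day_frac = (when.hour + when.minute / 60.0) / 24.0
    # 365.25 avoids special-casing leap years (an approximation).
    year_frac = (when.timetuple().tm_yday - 1 + day_frac) / 365.25
    feats = []
    for frac in (day_frac, year_frac):
        angle = 2.0 * math.pi * frac
        feats.extend([math.sin(angle), math.cos(angle)])
    return feats

print(time_phase_features(datetime(2024, 9, 29, 0)))
print(time_phase_features(datetime(2024, 9, 29, 6)))
```

The result is four values in [−1, 1]: sine and cosine of the daily phase, then sine and cosine of the annual phase.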

Evaluation

We compare FuXi-2.1 against FuXi-1.0 under an identical protocol: forecasts initialised from ERA5 and rolled out to 240 h in 6-hour steps. CSI is computed globally, over land grid points only. These numbers come from a limited set of sample cases, not a full-year evaluation; they are indicative, and broader scorecards will follow.

Headline: RMSE stays comparable to FuXi-1.0 across variables, while structural and extreme-event scores improve substantially.

Precipitation — Critical Success Index (CSI)

Threshold	FuXi-1.0	FuXi-2.1	Δ
≥ 5 mm	0.265	0.284	+7.3%
≥ 20 mm	0.131	0.146	+11.4%
≥ 50 mm	0.074	0.084	+13.4%
≥ 100 mm	0.014	0.024	+68.3%

10 m wind speed — Critical Success Index (CSI)

Threshold	FuXi-1.0	FuXi-2.1	Δ
≥ 10.8 m·s⁻¹	0.544	0.571	+4.8%
≥ 24.5 m·s⁻¹	0.165	0.198	+20.3%
≥ 28.5 m·s⁻¹	0.000	0.044	newly resolved

The relative gain grows with event intensity, peaking at the extreme tail. At the 28.5 m·s⁻¹ wind threshold FuXi-1.0 scores zero — it never predicts such winds — whereas FuXi-2.1 attains a non-zero CSI. Spatial power spectra of FuXi-2.1 track the observed spectra across the full wavenumber range, in contrast to FuXi-1.0's high-wavenumber energy deficit.
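
For reference, CSI is the number of hits divided by the sum of hits, misses and false alarms for a given threshold. The sketch below computes it over paired forecast and observed values; the sample numbers are invented for illustration and are not FuXi output, and the land-only masking used in the evaluation is not shown.

```python
def csi(forecast, observed, threshold):
    """Critical Success Index: hits / (hits + misses + false alarms)."""
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        f_event, o_event = f >= threshold, o >= threshold
        if f_event and o_event:
            hits += 1
        elif o_event:
            misses += 1
        elif f_event:
            false_alarms += 1
    total = hits + misses + false_alarms
    # Undefined when the event is neither forecast nor observed anywhere.
    return hits / total if total else float("nan")

# Invented toy values (e.g. precipitation in mm at six grid points).
observed = [2.0, 12.0, 30.0, 55.0, 0.0, 26.0]
sharp = [1.0, 15.0, 22.0, 60.0, 6.0, 18.0]
smooth = [3.0, 9.0, 14.0, 19.0, 2.0, 12.0]  # blurred: never reaches 20

print(csi(sharp, observed, 20.0), csi(smooth, observed, 20.0))
```

The smooth forecast never exceeds the threshold, so it has no hits and its CSI is exactly zero. This is the same mechanism behind a zero score at the highest wind threshold for a model that never predicts such winds.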

Known limitations

FuXi-2.1 is a deterministic model; it does not provide a calibrated ensemble spread.
The CSI numbers reported here are computed on land only, over a limited set of sample cases rather than a full-year evaluation; treat them as indicative. Comprehensive global scorecards will be added.
As with all ERA5-trained models, skill depends on the quality and resolution of the initial conditions.

Citation

If you use FuXi-2.1, please cite the FuXi series:

@article{chen2023fuxi,
  title   = {FuXi: a cascade machine learning forecasting system for 15-day global weather forecast},
  author  = {Chen, Lei and Zhong, Xiaohui and Zhang, Feng and Cheng, Yuan and Xu, Yimin and Qi, Yan and Li, Hao},
  journal = {npj Climate and Atmospheric Science},
  year    = {2023},
  volume  = {6},
  number  = {1},
  pages   = {190}
}

Code: FuXi-1.0 — https://github.com/tpys/FuXi
