FuXi-2.1


FuXi-2.1 is a global, deterministic machine-learning weather forecasting model developed by Fudan University & SAIS. It produces global forecasts at 0.25° resolution, in 6-hourly steps, out to 10 days.

FuXi-2.1 targets the defining failure mode of data-driven weather prediction: forecasts that blur into a smooth spatial average as lead time grows, erasing the small-scale structure that matters most for extremes. It produces markedly sharper fields whose spatial power spectra track observations across the full wavenumber range, keeps deterministic skill (RMSE) comparable to FuXi-1.0, and substantially improves extreme-event detection for heavy precipitation and strong wind.

This model is released as part of the FuXi Single collection.

What's new in 2.1

Relative to FuXi-1.0, FuXi-2.1 introduces:

  • A flat Transformer backbone replacing FuXi-1.0's U-Transformer (ResNet downsample → Swin Transformer → upsample). FuXi-2.1 drops the U-shaped up/down-sampling in favour of a single full-resolution Transformer trunk.
  • Rotary position embeddings (RoPE) inside the Swin windowed attention, replacing the learned relative-position bias used in FuXi-1.0.
  • adaLN time conditioning that injects time-period information — forecast lead step, time-of-day and day-of-year phase — into every block, inspired by diffusion transformers in image generation.
  • A variable-aware multi-head decoder that gives pressure-level, surface and derived variables their own specialised output heads.

The combined effect is sharper, spectrally faithful forecasts with no penalty on mean-error skill.
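As a rough illustration of the adaLN idea listed above, the NumPy sketch below layer-normalises a token tensor and then applies a scale and shift derived from a conditioning vector. It follows the generic formulation used in diffusion transformers. The function names, the sin/cos encoding of time-of-day and day-of-year, the lead-step scaling and the random projection weights are all assumptions made for illustration, not FuXi-2.1's actual implementation.

```python
import numpy as np

def time_features(lead_step, hour, day_of_year):
    """Hypothetical conditioning vector: scaled lead step plus cyclic time encodings."""
    tod = 2.0 * np.pi * hour / 24.0
    doy = 2.0 * np.pi * day_of_year / 365.25
    return np.array([lead_step / 40.0, np.sin(tod), np.cos(tod), np.sin(doy), np.cos(doy)])

def ada_layer_norm(x, cond, w_scale, w_shift, eps=1e-6):
    """LayerNorm over the last axis, then a per-feature scale/shift predicted from cond."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    scale = cond @ w_scale   # shape (dim,)
    shift = cond @ w_shift   # shape (dim,)
    return x_hat * (1.0 + scale) + shift

rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 8))                      # toy (tokens, dim)
cond = time_features(lead_step=4, hour=0, day_of_year=273)
w_scale = rng.normal(scale=0.02, size=(cond.size, 8))  # assumed linear projections
w_shift = rng.normal(scale=0.02, size=(cond.size, 8))
out = ada_layer_norm(tokens, cond, w_scale, w_shift)
```

With both projections set to zero the function reduces to a plain layer norm, which is why adaLN blocks are often initialised that way.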


Quickstart

This repository ships the exported model (fuxi-2.1.pt2), normalization statistics (mean.nc, std.nc), a sample pre-normalized input (input.nc), and minimal inference code.

# 1. Install dependencies
pip install -r requirements.txt

# 2. End-to-end demo (inference + plots)
bash run.sh --model_dir . --input input.nc --steps 5

# Or run inference directly (40 steps = 10-day forecast):
python inference.py \
    --model_dir . \
    --input input.nc \
    --output_dir ./output \
    --steps 40 \
    --forecast_time 2024092900

# 3. Plot selected channels
python plot.py --output_dir ./output --channels t2m z500 tp --discrete

Input: a NetCDF file containing a variable named input with shape (time=2, channel=85, lat=721, lon=1440), z-score normalized, with coordinates lat 90→−90 and lon 0→359.75. The two time entries are the states at t−6h and t₀. The provided input.nc is a sample for 2024-09-29 00Z.
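To make the expected layout concrete, here is a minimal sketch of z-score normalising a two-frame stack and checking its shape before inference. It assumes the statistics are per-channel vectors of length 85; the actual layout of mean.nc and std.nc is not described here, and zscore and check_layout are hypothetical helper names. A coarse toy grid keeps the example light.

```python
import numpy as np

N_FRAMES, N_CHANNELS = 2, 85

def zscore(x, mean, std):
    """Normalize (time, channel, lat, lon) data with per-channel statistics (assumed layout)."""
    return (x - mean[None, :, None, None]) / std[None, :, None, None]

def check_layout(x, n_lat=721, n_lon=1440):
    """Raise if the array does not match the (time, channel, lat, lon) layout."""
    expected = (N_FRAMES, N_CHANNELS, n_lat, n_lon)
    if x.shape != expected:
        raise ValueError(f"expected shape {expected}, got {x.shape}")

# Toy data on a 5x8 grid; the real grid is 721x1440.
rng = np.random.default_rng(0)
mean = rng.normal(size=N_CHANNELS)
std = rng.uniform(0.5, 2.0, size=N_CHANNELS)
raw = mean[None, :, None, None] + std[None, :, None, None] * rng.normal(
    size=(N_FRAMES, N_CHANNELS, 5, 8)
)
x = zscore(raw, mean, std).astype(np.float32)
check_layout(x, n_lat=5, n_lon=8)
```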

Output: each step is saved as {output_dir}/{step:03d}.nc with shape (channel=85, lat=721, lon=1440), in physical units (denormalized), and carries a valid_time attribute. Steps are 1-based: 001.nc = +6 h, … 040.nc = +240 h (10 days).
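The step-to-file and step-to-lead-time mapping can be sketched as follows. The file name pattern and the 6-hour step come from the description above; the helper names are hypothetical, and how inference.py formats the valid_time attribute is not specified here.

```python
from datetime import datetime, timedelta

STEP_HOURS = 6

def output_name(step):
    """File name for a 1-based forecast step, e.g. 1 -> '001.nc'."""
    if step < 1:
        raise ValueError("steps are 1-based")
    return f"{step:03d}.nc"

def valid_time(forecast_time, step):
    """Valid time for a step, given an initial time in YYYYMMDDHH form."""
    init = datetime.strptime(forecast_time, "%Y%m%d%H")
    return init + timedelta(hours=STEP_HOURS * step)

for step in (1, 40):
    print(output_name(step), valid_time("2024092900", step))
# 001.nc 2024-09-29 06:00:00
# 040.nc 2024-10-09 00:00:00
```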

GPU: the device is baked into the exported graph, so the model must be loaded on CUDA. About 8 GB of GPU memory is enough (model ~4 GB + recurrent state ~1.4 GB + working memory). Tested on A100, V100, RTX 3090/4090.

See variables.py for the full ordered channel list.


Model overview

Model description

FuXi-2.1 is a single Transformer. The global atmospheric state is split into patches and embedded into tokens, processed by a stack of windowed-attention blocks, and read out by a variable-aware multi-head decoder. The model is deterministic — one forward pass per step, with no adversarial or diffusion sampling at inference — and is rolled out autoregressively at 6-hourly steps.

  • Developed by: Fudan University & SAIS
  • Model type: Transformer (patch-embed → Swin attention with RoPE + adaLN → multi-head decoder)
  • Forecast type: Global, deterministic, autoregressive
  • License: CC BY 4.0
  • Predecessor: FuXi-1.0
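For readers unfamiliar with rotary position embeddings, the sketch below shows a generic 1-D RoPE in NumPy: pairs of features are rotated by an angle proportional to the token position, so attention scores depend only on relative offsets. This is the textbook formulation, not code from FuXi-2.1; the base of 10000 and the way positions would be assigned inside a Swin window are assumptions.

```python
import numpy as np

def rope_1d(x, positions, base=10000.0):
    """Apply 1-D rotary embeddings to x of shape (seq, dim), with dim even."""
    seq, dim = x.shape
    if dim % 2:
        raise ValueError("dim must be even")
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)   # (dim/2,)
    angles = positions[:, None] * inv_freq[None, :]    # (seq, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x_even * cos - x_odd * sin          # rotate each feature pair
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out

rng = np.random.default_rng(0)
q = rng.normal(size=(1, 8))
k = rng.normal(size=(1, 8))
# Same relative offset (2) at different absolute positions.
score_a = (rope_1d(q, np.array([3.0])) @ rope_1d(k, np.array([1.0])).T).item()
score_b = (rope_1d(q, np.array([10.0])) @ rope_1d(k, np.array([8.0])).T).item()
```

The two scores agree up to floating-point error, which is the property that lets RoPE stand in for a learned relative-position bias.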

[Figure: FuXi-2.1 architecture]

Architecture details

| Component | Specification |
| --- | --- |
| Backbone | Single Transformer trunk (no U-Net up/down-sampling) |
| Attention | Swin windowed attention |
| Position encoding | Rotary (RoPE, 1-D) |
| Normalisation / conditioning | adaLN, conditioned on lead step, time-of-day, day-of-year |
| Feed-forward | SwiGLU |
| Decoder | Variable-aware multi-head (pressure / surface / derived) |
| Input frames | 2 (states at t−6h and t₀) |
| Output | State at t+6h, rolled out autoregressively |
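
The two-frame autoregressive roll-out can be sketched as a sliding window over states. In this sketch step_fn is a hypothetical stand-in for the exported network, and toy_step is a trivial extrapolation used only to make the loop runnable. Handling of static forcings and of diagnostic channels that are not fed back is omitted.

```python
import numpy as np

def rollout(step_fn, x_prev, x_curr, n_steps):
    """Autoregressive roll-out: each step consumes the two most recent states."""
    outputs = []
    for lead_step in range(1, n_steps + 1):
        x_next = step_fn(np.stack([x_prev, x_curr]), lead_step)
        outputs.append(x_next)
        x_prev, x_curr = x_curr, x_next   # slide the two-frame window forward
    return outputs

def toy_step(frames, lead_step):
    """Stand-in for the network: linear extrapolation from the two frames."""
    return 2.0 * frames[1] - frames[0]

x_prev = np.zeros((3, 4, 8))   # toy (channel, lat, lon) at t-6h
x_curr = np.ones((3, 4, 8))    # toy state at t0
states = rollout(toy_step, x_prev, x_curr, n_steps=40)   # 40 steps = 240 h
```

Because each prediction becomes an input to the next step, errors can accumulate with lead time, which is why roll-out behaviour matters as much as single-step accuracy.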

Model resolution

| Model | Horizontal resolution | Vertical resolution [pressure levels] (hPa) |
| --- | --- | --- |
| FuXi-2.1 | 0.25° (721×1440) | 13: 50, 100, 150, 200, 250, 300, 400, 500, 600, 700, 850, 925, 1000 |
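
The grid implied by these numbers can be reconstructed in a few lines. The sketch assumes the coordinate convention stated in the Quickstart (lat 90→−90, lon 0→359.75); nearest_index is a hypothetical helper for looking up a grid cell.

```python
import numpy as np

RES = 0.25
lat = np.linspace(90.0, -90.0, 721)   # north to south, both poles included
lon = np.arange(1440) * RES           # 0 to 359.75 degrees east
levels = [50, 100, 150, 200, 250, 300, 400, 500, 600, 700, 850, 925, 1000]

def nearest_index(lat_deg, lon_deg):
    """Nearest (lat, lon) grid indices for a point; longitude wraps at 360."""
    i = int(round((90.0 - lat_deg) / RES))
    j = int(round((lon_deg % 360.0) / RES)) % 1440
    return i, j

i, j = nearest_index(31.23, 121.47)   # an arbitrary example point
print(i, j, lat[i], lon[j])           # 235 486 31.25 121.5
```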

Data details

Training data

FuXi-2.1 is trained and evaluated on ERA5 reanalysis at 0.25° resolution, 6-hourly.

  • Training period: 2002–2023
  • Test period: 2024 (held out)

Data parameters

FuXi-2.1 operates on 85 channels per time step: 65 pressure-level channels (5 variables × 13 levels) and 20 surface channels, plus static forcings supplied as constant inputs. Most channels are prognostic — the same channels are input and output and fed back during roll-out. Radiation fluxes and total precipitation are diagnostic outputs produced through a dedicated decoder head (they are predicted but not fed back as inputs).

Channel order (exact): the 65 pressure-level channels first (z, then t, u, v, q, each over the 13 levels 50→1000 hPa), followed by the 20 surface channels: msl, t2m, d2m, sst, ws10m, ws100m, u10m, v10m, u100m, v100m, lcc, mcc, hcc, tcc, ssr, ssrd, fdir, ttr, tcw, tp.
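
The ordering above can be turned into a name-to-index lookup. The name format for pressure-level channels (variable followed by level, such as z500) is inferred from the plot.py example in the Quickstart and is otherwise an assumption; variables.py remains the authoritative list.

```python
LEVELS = [50, 100, 150, 200, 250, 300, 400, 500, 600, 700, 850, 925, 1000]
PL_VARS = ["z", "t", "u", "v", "q"]
SURFACE = [
    "msl", "t2m", "d2m", "sst", "ws10m", "ws100m", "u10m", "v10m", "u100m", "v100m",
    "lcc", "mcc", "hcc", "tcc", "ssr", "ssrd", "fdir", "ttr", "tcw", "tp",
]
# Diagnostic outputs: predicted but not fed back during roll-out.
DIAGNOSTIC = ["ssr", "ssrd", "fdir", "ttr", "tp"]

# Pressure-level channels first (variable-major, levels 50 -> 1000 hPa), then surface.
CHANNELS = [f"{var}{level}" for var in PL_VARS for level in LEVELS] + SURFACE
INDEX = {name: idx for idx, name in enumerate(CHANNELS)}

print(len(CHANNELS), INDEX["z500"], INDEX["t2m"], INDEX["tp"])   # 85 7 66 84
```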

Pressure-level parameters (13 levels: 50–1000 hPa)

| Short name | Name | Units | Input/Output |
| --- | --- | --- | --- |
| z | Geopotential | m²·s⁻² | Both (prognostic) |
| t | Temperature | K | Both (prognostic) |
| u | Eastward wind | m·s⁻¹ | Both (prognostic) |
| v | Northward wind | m·s⁻¹ | Both (prognostic) |
| q | Specific humidity | kg·kg⁻¹ | Both (prognostic) |

Surface parameters (20)

| Short name | Name | Units | Input/Output |
| --- | --- | --- | --- |
| msl | Mean sea-level pressure | Pa | Both (prognostic) |
| t2m | 2 m temperature | K | Both (prognostic) |
| d2m | 2 m dewpoint temperature | K | Both (prognostic) |
| sst | Sea-surface temperature | K | Both (prognostic) |
| ws10m | 10 m wind speed | m·s⁻¹ | Both (prognostic) |
| ws100m | 100 m wind speed | m·s⁻¹ | Both (prognostic) |
| u10m | 10 m eastward wind | m·s⁻¹ | Both (prognostic) |
| v10m | 10 m northward wind | m·s⁻¹ | Both (prognostic) |
| u100m | 100 m eastward wind | m·s⁻¹ | Both (prognostic) |
| v100m | 100 m northward wind | m·s⁻¹ | Both (prognostic) |
| lcc | Low cloud cover | 0–1 | Both (prognostic) |
| mcc | Medium cloud cover | 0–1 | Both (prognostic) |
| hcc | High cloud cover | 0–1 | Both (prognostic) |
| tcc | Total cloud cover | 0–1 | Both (prognostic) |
| tcw | Total column water | kg·m⁻² | Both (prognostic) |
| ssr | Surface net solar radiation | J·m⁻² | Output (diagnostic) |
| ssrd | Surface solar radiation downwards | J·m⁻² | Output (diagnostic) |
| fdir | Total-sky direct solar radiation at surface | J·m⁻² | Output (diagnostic) |
| ttr | Top net thermal radiation | J·m⁻² | Output (diagnostic) |
| tp | Total precipitation | mm | Output (diagnostic) |

| Field | Level type | Input/Output |
| --- | --- | --- |
| Land-sea mask, orography/geopotential, latitude/longitude encodings, time-of-day / day-of-year | Surface / static | Input (forcings) |

Evaluation

We compare FuXi-2.1 against FuXi-1.0 under an identical protocol: forecasts initialised from ERA5 and rolled out to 240 h in 6-hour steps. CSI is computed over land only, globally. These numbers come from a limited set of sample cases, not a full-year evaluation — they are indicative, and broader scorecards will follow.

Headline: RMSE stays comparable to FuXi-1.0 across variables, while structural and extreme-event scores improve substantially.
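
For reference, the Critical Success Index is hits / (hits + misses + false alarms) for events at or above a threshold. The sketch below is a generic implementation with an optional land mask. The function name and the toy arrays are illustrative, and the exact evaluation code behind the reported scores is not part of this card.

```python
import numpy as np

def csi(forecast, observed, threshold, mask=None):
    """CSI = hits / (hits + misses + false alarms) for events >= threshold."""
    f = forecast >= threshold
    o = observed >= threshold
    if mask is not None:
        f, o = f[mask], o[mask]   # e.g. keep land points only
    hits = np.sum(f & o)
    misses = np.sum(~f & o)
    false_alarms = np.sum(f & ~o)
    denom = hits + misses + false_alarms
    return float(hits / denom) if denom > 0 else float("nan")

obs = np.array([[0.0, 6.0, 25.0], [7.0, 0.0, 60.0]])
fct = np.array([[5.5, 6.5, 10.0], [1.0, 0.0, 55.0]])
land = np.array([[True, True, True], [True, True, False]])

print(csi(fct, obs, threshold=5.0, mask=land))   # 0.5 (2 hits, 1 miss, 1 false alarm)
print(csi(fct, obs, threshold=5.0))              # 0.6 without the mask
```

A model that never forecasts an event above the threshold has zero hits, so its CSI is 0 whenever the event was observed at least once.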

[Figures: Precipitation CSI; Wind-speed CSI]

Precipitation — Critical Success Index (CSI)

| Threshold | FuXi-1.0 | FuXi-2.1 | Δ |
| --- | --- | --- | --- |
| ≥ 5 mm | 0.265 | 0.284 | +7.3% |
| ≥ 20 mm | 0.131 | 0.146 | +11.4% |
| ≥ 50 mm | 0.074 | 0.084 | +13.4% |
| ≥ 100 mm | 0.014 | 0.024 | +68.3% |

10 m wind speed — Critical Success Index (CSI)

| Threshold | FuXi-1.0 | FuXi-2.1 | Δ |
| --- | --- | --- | --- |
| ≥ 10.8 m·s⁻¹ | 0.544 | 0.571 | +4.8% |
| ≥ 24.5 m·s⁻¹ | 0.165 | 0.198 | +20.3% |
| ≥ 28.5 m·s⁻¹ | 0.000 | 0.044 | newly resolved |

The relative gain grows with event intensity, peaking at the extreme tail. At the 28.5 m·s⁻¹ wind threshold FuXi-1.0 scores zero — it never predicts such winds — whereas FuXi-2.1 attains a non-zero CSI. Spatial power spectra of FuXi-2.1 track the observed spectra across the full wavenumber range, in contrast to FuXi-1.0's high-wavenumber energy deficit.
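
One common way to check for a high-wavenumber energy deficit is a zonal power spectrum: an FFT along longitude, averaged over latitude. The sketch below is a generic diagnostic of that kind, not necessarily how the spectra referred to above were computed. It applies no latitude weighting, and the synthetic "sharp" and "blurred" fields are made up for illustration.

```python
import numpy as np

def zonal_power_spectrum(field):
    """Mean power per zonal wavenumber for a (lat, lon) field."""
    n_lon = field.shape[-1]
    coeffs = np.fft.rfft(field, axis=-1) / n_lon
    power = np.abs(coeffs) ** 2
    power[..., 1:] *= 2.0          # fold in the negative wavenumbers
    if n_lon % 2 == 0:
        power[..., -1] /= 2.0      # the Nyquist bin has no negative partner
    return power.mean(axis=0)

n_lat, n_lon = 16, 64
lon = 2.0 * np.pi * np.arange(n_lon) / n_lon
# Same large-scale wave (k=3); the blurred field has lost most of its k=20 energy.
sharp = np.tile(np.cos(3 * lon) + 0.5 * np.cos(20 * lon), (n_lat, 1))
blurred = np.tile(np.cos(3 * lon) + 0.05 * np.cos(20 * lon), (n_lat, 1))

ps_sharp = zonal_power_spectrum(sharp)
ps_blur = zonal_power_spectrum(blurred)
print(ps_sharp[20] / ps_blur[20])   # ratio of small-scale power, about 100
```

Both fields have the same power at wavenumber 3, so a grid-point error metric dominated by large scales can look similar for the two while the spectra differ by two orders of magnitude at wavenumber 20.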


Known limitations

  • FuXi-2.1 is a deterministic model; it does not provide a calibrated ensemble spread.
  • The CSI numbers reported here are computed on land only, over a limited set of sample cases rather than a full-year evaluation; treat them as indicative. Comprehensive global scorecards will be added.
  • As with all ERA5-trained models, skill depends on the quality and resolution of the initial conditions.

Citation

If you use FuXi-2.1, please cite the FuXi series:

@article{chen2023fuxi,
  title   = {FuXi: a cascade machine learning forecasting system for 15-day global weather forecast},
  author  = {Chen, Lei and Zhong, Xiaohui and Zhang, Feng and Cheng, Yuan and Xu, Yimin and Qi, Yan and Li, Hao},
  journal = {npj Climate and Atmospheric Science},
  year    = {2023},
  volume  = {6},
  number  = {1},
  pages   = {190}
}

Code: FuXi-1.0 — https://github.com/tpys/FuXi

© 2026 Fudan University & SAIS · FuXi Weather.
