
AIFS ENS v2.0 failing with "Complete ERA5 global atmospheric reanalysis" dataset

#1
by iji000 - opened

I am trying to run AIFS ENS 2.0 with the following configuration file:

checkpoint:
  huggingface: "ecmwf/aifs-ens-2.0"

date: 1996-08-23T06:00:00

lead_time: 426

input:
  cds:
    dataset: 'reanalysis-era5-complete'

output:
  netcdf: /panfs/jahani/AIFS_runs/TCs_SN_list/whole_run/aifs_ens_v2.0/aifs_ens_v2_0_fran_6hr_bfr.nc

I was able to run AIFS ENS v1.0 with this configuration successfully. However, with the new version, I am getting the following error:

"The job has failed
MARS returned no data, please check your selection.
Request submitted to the MARS server:
[{'area': ['90.0', '0.0', '-90.0', '360.0'], 'class': ['ea'], 'database': 'fdbmarser', 'date': ['1996-08-23'], 'domain': ['g'], 'expect': ['any'], 'expver': ['1'], 'grid': ['N320'], 'levtype': ['sfc'], 'number': ['all'], 'param': ['10u', '10v', '2d', '2t', 'msl', 'sd', 'skt', 'sp', 'stl1', 'stl2', 'swvl1', 'swvl2', 'tcw'], 'step': ['0'], 'stream': ['enfo'], 'time': ['0000', '0600'], 'type': ['an']}]"

I am not sure why the model is trying to fetch data from stream 'enfo' for "reanalysis-era5-complete". When I ran AIFS single v1.1 and AIFS ENS v1.0 with the same config file, the stream was "oper" for "reanalysis-era5-complete", which is the complete ERA5 global atmospheric reanalysis on the N320 grid.

I would greatly appreciate any thoughts on this.

Hi,
Thanks for your quick interest in the model.
I believe the issue comes from the slightly improved metadata in the checkpoint, which now provides a default stream. If you set the stream in your input config to oper, the request will resolve correctly.

use_grib_paramid: false

input:
  cds:
    dataset: 'reanalysis-era5-complete'
    stream: oper
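
To illustrate why setting the key explicitly helps, here is a minimal Python sketch of the general "user setting overrides checkpoint default" pattern. It is an illustration only and not the actual anemoi-inference code path; the names `checkpoint_defaults`, `user_input_config` and `resolve_request` are hypothetical.

```python
# Illustration only: NOT the actual anemoi-inference implementation.
# Assumption: the checkpoint metadata supplies a default stream ("enfo"),
# and keys set by the user in the input config take precedence.
checkpoint_defaults = {"stream": "enfo"}  # hypothetical default from metadata
user_input_config = {"dataset": "reanalysis-era5-complete", "stream": "oper"}


def resolve_request(defaults, overrides):
    """Return a new dict in which user-supplied keys win over defaults."""
    request = dict(defaults)   # copy so the defaults stay untouched
    request.update(overrides)  # user keys replace default keys
    return request


request = resolve_request(checkpoint_defaults, user_input_config)
print(request["stream"])  # oper
```

Without the explicit `stream: oper` entry, the default from the checkpoint would be the value that ends up in the request, which matches the 'enfo' seen in the first error message.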

However, AIFS-ENS-2.0 also includes wave fields, as discussed in the model card and in this blog, which are not included in ERA5. The wave hindcasts used for training are slowly being made open, with this one available right now https://apps.ecmwf.int/ifs-experiments/rd/ix7j/.
To initialise the AIFS, you will have to download the parts you can from the CDS and stitch them together with these wave hindcasts.
One other thing to consider: the AIFS was fine-tuned on IFS, so it will experience degradation when starting from ERA5. Please be mindful of this when scoring the model.
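
The stitching step can be sketched in Python under one assumption: both downloads are GRIB files. GRIB messages are self-contained, so files can be combined by appending them byte for byte. The function name `concatenate_grib` and the file names in the comment are hypothetical. The sketch does not regrid or check anything, so both files would already need to be on the grid and for the dates the model expects.

```python
import shutil
from pathlib import Path


def concatenate_grib(parts, target):
    """Append each GRIB file in `parts` to `target`, byte for byte.

    No decoding, regridding or validation is done here.
    """
    target = Path(target)
    with target.open("wb") as out:
        for part in parts:
            with Path(part).open("rb") as src:
                shutil.copyfileobj(src, out)
    return target


# Example call with hypothetical file names:
# concatenate_grib(["era5_fields.grib", "wave_hindcast.grib"], "initial_state.grib")
```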


Thank you for your reply. With ensemble version 1.0, I didn't need to download the input data. Based on the config file, the model would automatically grab data from 'reanalysis-era5-complete' during inference. I could see the input data I requested getting queued in my CDS account, but I didn't have to download it to my computer/HPC. Does this functionality change in the new version, meaning that I now have to download the input data first and then point to the downloaded file path in the config file?

The settings you suggested solved the "oper" stream issue, but the model also tries to grab the wave variables from 'reanalysis-era5-complete' and gives the following error.

MARS has returned an error, please check your selection.
Request submitted to the MARS server:
[{'area': ['90.0', '0.0', '-90.0', '360.0'], 'class': ['ea'], 'database': 'fdbmarser', 'date': ['2024-09-23'], 'domain': ['g'], 'expect': ['any'], 'expver': ['1'], 'grid': ['N320'], 'levtype': ['sfc'], 'number': ['all'], 'param': ['cdww', 'cos_mwd', 'h1012', 'h1214', 'h1417', 'h1721', 'h2125', 'h2530', 'mwp', 'sin_mwd', 'swh'], 'step': ['0'], 'stream': ['oper'], 'time': ['0000', '0600'], 'type': ['an']}]
Full error message:
mars - ERROR - 20260519.163345 - Ambiguous : cos_mwd could be WILDFIRE FLUX OF CARBON MONOXIDE or CARBON MONOXIDE
mars - ERROR - 20260519.163345 - Ambiguous : sin_mwd could be SIGNIFICANT WAVE HEIGHT OF AT LEAST 8 M or SIGNIFICANT WAVE HEIGHT PROBABILITY
The job failed with: MarsRuntimeError

My full config file follows. Can you please tell me what changes I need to make for the wave variables?

checkpoint:
  huggingface: "ecmwf/aifs-ens-2.0"


# Choose input data (6 hours before the first forecast hour)
date: 2024-09-23T06:00:00

lead_time: 138

use_grib_paramid: false

input:
  cds:
    dataset: 'reanalysis-era5-complete'
    stream: oper


output:
  netcdf: /panfs/jahani/AIFS_runs/TCs_SN_list/whole_run/aifs_ens_v2.0/aifs_ens_v2_0_helene_run1_6hr_bfr.nc

I really appreciate your help!

However, AIFS-ENS-2.0 also includes wave fields, as discussed in the model card and in this blog which are not included in ERA5.

ERA5 does not include the wave fields used in AIFS v2.0. You will not be able to rely on the automatic download here; you will have to download the fields yourself and stitch them together.
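
As a rough aid for planning the two downloads, the Python sketch below splits a list of parameter short names into those to request from ERA5 and those to take from a wave hindcast. The wave names are copied from the failing MARS request quoted earlier in this thread; treating exactly that list as "the wave fields" is an assumption, and `split_params` is a hypothetical helper.

```python
# Short names copied from the failing MARS request in this thread.
# Assumption: this list is the full set of wave fields the model needs.
WAVE_PARAMS = {
    "cdww", "cos_mwd", "h1012", "h1214", "h1417", "h1721",
    "h2125", "h2530", "mwp", "sin_mwd", "swh",
}


def split_params(params, wave_params=WAVE_PARAMS):
    """Return (era5_params, wave_params_found), preserving input order."""
    era5, wave = [], []
    for name in params:
        (wave if name in wave_params else era5).append(name)
    return era5, wave


era5, wave = split_params(["2t", "swh", "msl", "cos_mwd"])
print(era5)  # ['2t', 'msl']
print(wave)  # ['swh', 'cos_mwd']
```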

Thanks for the clarification. I am interested in simulating historical tropical cyclone events with the new version of AIFS. I have two follow-up questions:

  1. The wave hindcasts at this link https://apps.ecmwf.int/ifs-experiments/rd/ix7j/ appear to cover 1980-1989. My understanding is that, with these wave hindcasts, the model cannot be initialized to simulate an event outside of 1980-1989. Is that correct?
  2. If I want to run the model to simulate events beyond 1989, which dataset can be used to download the wave variables for initialization? As far as I know, IFS also doesn't provide wave variables.
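
The coverage question in item 1 can be expressed as a small Python check. It assumes that the model needs two input states, the configured date and the state 6 hours earlier, as suggested by the `'time': ['0000', '0600']` entries in the MARS requests above. The year range is whatever the available hindcast actually covers; 1980-1989 below is only the reading given in this post. The helper names `input_times` and `covered` are hypothetical.

```python
from datetime import datetime, timedelta


def input_times(init, step_hours=6):
    """Return the two input times: one step before `init`, and `init` itself."""
    return [init - timedelta(hours=step_hours), init]


def covered(init, first_year, last_year):
    """True if both input times fall within [first_year, last_year]."""
    return all(first_year <= t.year <= last_year for t in input_times(init))


# Dates taken from the two config files in this thread.
print(covered(datetime(1996, 8, 23, 6), 1980, 1989))  # False
print(covered(datetime(2024, 9, 23, 6), 1980, 1989))  # False
```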

Thank you. I appreciate your help!

ECMWF org

We are slowly making more of the wave hindcasts available; you can search on this page for CY50R1 wave hindcast. At the moment only one is available, but more will be made available soon.
