pereza committed on
Commit
f76e35a
1 Parent(s): fa714d8

Update README.md

Adding processing steps to the data downloaded from the CDS

Files changed (1)
  1. README.md +21 -1
README.md CHANGED
@@ -136,7 +136,27 @@ The data samples used for training corresponds to the period from 1985 and 2013
 
  ### Preprocessing
 
- More information needed
+ ERA5 and CERRA data downloaded from the Climate Data Store (CDS) must be preprocessed before they can be used to train models.
+ This section describes the steps taken to bring both datasets into a common format for the experiment:
+ unit standardization, coordinate system rectification, and grid interpolation.
+
+ 1. Unit Standardization:
+ The units of both datasets were checked and converted to a common unit system,
+ so that the same variable is expressed in the same units in ERA5 and CERRA.
+
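The unit step above can be sketched as follows. This is only an illustration: the README does not say which variables or units were converted, so the conversion table, the unit labels, and the `standardize` helper are all hypothetical.

```python
# Hypothetical conversion table: (source unit, target unit) -> function.
# The README does not list the variables or units that were actually converted.
CONVERSIONS = {
    ("K", "degC"): lambda v: v - 273.15,   # Kelvin to degrees Celsius
    ("m", "mm"): lambda v: v * 1000.0,     # metres to millimetres
}


def standardize(values, unit, target_unit):
    """Return `values` expressed in `target_unit`."""
    if unit == target_unit:
        return list(values)
    try:
        convert = CONVERSIONS[(unit, target_unit)]
    except KeyError:
        raise ValueError(f"no conversion from {unit!r} to {target_unit!r}") from None
    return [convert(v) for v in values]


# Illustrative values only: one dataset in Kelvin, the other already in Celsius.
era5_t2m = standardize([273.15, 283.15], "K", "degC")
cerra_t2m = standardize([0.0, 10.0], "degC", "degC")
```

Raising on an unknown unit pair, rather than passing the values through, makes a unit mismatch visible before the two datasets are combined.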
+ 2. Coordinate System Rectification:
+ The coordinates and dimensions of both datasets were renamed to a standard convention, with longitude (lon) and latitude (lat) as the designated names.
+ Longitude values were shifted from the original 0 to 360 range to -180 to 180, and latitude values were sorted in ascending order,
+ in line with common geographical conventions.
+
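A minimal sketch of the longitude shift and latitude ordering described above, written with plain Python lists. The function names are hypothetical, and re-sorting the longitudes after the shift is an assumption made here to keep the axis monotonic; the README only states that latitudes were sorted.

```python
def wrap_longitude(lon):
    """Map a longitude from the 0..360 convention to [-180, 180)."""
    # Note: 180 maps to -180 with this formula.
    return (lon + 180.0) % 360.0 - 180.0


def rectify(lons, lats, field):
    """Shift longitudes and sort both axes in ascending order.

    `field[i][j]` is the value at `lats[i]`, `lons[j]`.
    Returns (new_lons, new_lats, new_field) with the data reordered to match.
    """
    shifted = [wrap_longitude(x) for x in lons]
    # Sort indices rather than values so the data can follow the axes.
    lon_order = sorted(range(len(shifted)), key=shifted.__getitem__)
    lat_order = sorted(range(len(lats)), key=lats.__getitem__)
    new_field = [[field[i][j] for j in lon_order] for i in lat_order]
    return (
        [shifted[j] for j in lon_order],
        [lats[i] for i in lat_order],
        new_field,
    )


# Illustrative 2 x 4 grid with descending latitudes and 0..360 longitudes.
new_lons, new_lats, new_field = rectify(
    [0.0, 90.0, 180.0, 270.0],
    [50.0, 40.0],
    [[1, 2, 3, 4], [5, 6, 7, 8]],
)
```

Reordering the data together with the axes matters: changing only the coordinate values would leave each value attached to the wrong location.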
+ 3. Grid Interpolation:
+ ERA5 is provided on a regular grid with a spatial resolution of 0.25°, whereas CERRA is provided on a curvilinear grid in a Lambert Conformal
+ projection with a higher spatial resolution (0.05°). To reconcile the two, CERRA was interpolated onto a regular grid.
+ After this step both datasets share the same regular grid structure, each at its own spatial resolution,
+ which keeps their spatial representation consistent.
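To illustrate what moving from a curvilinear grid to a regular one involves, here is a small nearest-neighbour sketch in plain Python. The README does not state which interpolation method was used, so nearest-neighbour is an assumption chosen for brevity, and `regrid_nearest` is a hypothetical helper. A real pipeline would normally use a dedicated regridding tool.

```python
def regrid_nearest(src_lats, src_lons, src_vals, dst_lats, dst_lons):
    """Nearest-neighbour regridding from a curvilinear to a regular grid.

    `src_lats`, `src_lons`, `src_vals` are 2D lists of equal shape: every
    source cell carries its own latitude and longitude (curvilinear grid).
    `dst_lats` and `dst_lons` are the 1D axes of the regular target grid.
    """
    # Flatten the source grid into (lat, lon, value) points.
    points = [
        (lat, lon, val)
        for lat_row, lon_row, val_row in zip(src_lats, src_lons, src_vals)
        for lat, lon, val in zip(lat_row, lon_row, val_row)
    ]
    out = []
    for dlat in dst_lats:
        row = []
        for dlon in dst_lons:
            # Squared distance in degree space: a crude approximation that
            # ignores the convergence of meridians, acceptable for a sketch.
            nearest = min(
                points,
                key=lambda p: (p[0] - dlat) ** 2 + (p[1] - dlon) ** 2,
            )
            row.append(nearest[2])
        out.append(row)
    return out


# Illustrative 2 x 2 curvilinear patch mapped onto a 0.05-degree regular grid.
regridded = regrid_nearest(
    src_lats=[[40.00, 40.02], [40.05, 40.07]],
    src_lons=[[10.00, 10.05], [9.99, 10.04]],
    src_vals=[[1.0, 2.0], [3.0, 4.0]],
    dst_lats=[40.0, 40.05],
    dst_lons=[10.0, 10.05],
)
```

The brute-force search is quadratic in grid size, so it only suits small examples. It does show the key point: on a curvilinear grid each cell needs its own 2D latitude and longitude, while the target grid is fully described by two 1D axes.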
 
  ### Speeds, Sizes, Times
 