jpxkqx/ddpm_mediterranean_reanalysis_tas

@@ -115,16 +115,22 @@ The data samples used for training corresponds to the period from 1981 and 2013
 All of these normalization techniques have been explored during and after [ECMWF Code 4 Earth](https://codeforearth.ecmwf.int/).
-**With monthly climatologies**. This corresponds to compute the historical climatologies during the training period for each region (pixel or domain), and normalize with respect to that.
-  - _Pixel-wise_: In this case, we use the climatology at each pixel, to standardize its values.
-  - _Domain-wise_: Here, we use the same statistics for the whole domain.
-      - In one case, this is implemented independently for both ERA5 and CERRA.
-      - In the other case, we use only the climatologies for one of the datasets to standardize both input as outputs as they are in the same magnitude.
-**Without past information**. This corresponds to normalize each sample independently by the mean and standard deviation of the ERA5 field.
-  - Use the statistics of the input ERA5, which covers a wider area than CERRA.
-  - Use the statistics of the upscaled ERA5, which represents the same area than CERRA.
 # Results

 All of these normalization techniques have been explored during and after [ECMWF Code 4 Earth](https://codeforearth.ecmwf.int/).
+**With monthly climatologies**. This corresponds to computing the historical climatologies over the training period for each region (pixel or domain) and normalizing with respect to them. Here we use monthly climatologies, but they could also be disaggregated further, for example by time of day.
+  - _Pixel-wise_: In this case, the climatology is computed for each pixel of the meteorological field. Then, each pixel is standardized with its own climatology statistics.
+  - _Domain-wise_: Here, the climatology statistics are computed over the whole domain of interest. Once the statistics are computed, two normalization schemes are possible:
+      - Independent: ERA5 and CERRA are each normalized with their own statistics.
+      - Dependent: only the ERA5 climatology statistics are used, to standardize both ERA5 and CERRA.
+The dependent approach is not feasible for the pixel-wise scheme, because there is no direct correspondence between the pixels of the input and output patches. If we wanted to do so, one possibility would be to compute the statistics over the bicubically downscaled ERA5 and use them to normalize CERRA.
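The climatology-based schemes described above can be sketched with NumPy as follows. This is a minimal illustration and not the repository's implementation: the helper names `monthly_stats` and `standardize`, the array shapes, and the synthetic temperature fields are all assumptions made for the example.

```python
import numpy as np

def monthly_stats(fields, months, pixel_wise):
    """Return {month: (mean, std)} computed over the training samples.

    fields has shape (time, lat, lon) and months has shape (time,).
    With pixel_wise=True the statistics keep the (lat, lon) shape;
    otherwise they are scalars for the whole domain.
    """
    axis = 0 if pixel_wise else None
    stats = {}
    for m in np.unique(months):
        sel = fields[months == m]
        stats[int(m)] = (sel.mean(axis=axis), sel.std(axis=axis))
    return stats

def standardize(fields, months, stats):
    """Standardize each sample with the statistics of its calendar month."""
    out = np.empty_like(fields, dtype=float)
    for i, m in enumerate(months):
        mean, std = stats[int(m)]
        out[i] = (fields[i] - mean) / std
    return out

# Synthetic stand-ins for the training data (hypothetical shapes and values).
rng = np.random.default_rng(0)
months = np.tile(np.arange(1, 13), 3)                   # three years, 36 samples
era5 = 285.0 + 5.0 * rng.standard_normal((36, 4, 4))    # coarse input
cerra = 284.0 + 6.0 * rng.standard_normal((36, 8, 8))   # fine target

# Domain-wise, independent: each dataset uses its own statistics.
era5_stats = monthly_stats(era5, months, pixel_wise=False)
cerra_stats = monthly_stats(cerra, months, pixel_wise=False)
era5_ind = standardize(era5, months, era5_stats)
cerra_ind = standardize(cerra, months, cerra_stats)

# Domain-wise, dependent: the ERA5 statistics standardize both datasets.
cerra_dep = standardize(cerra, months, era5_stats)

# Pixel-wise, independent: one (mean, std) pair per pixel and month.
era5_pix_stats = monthly_stats(era5, months, pixel_wise=True)
era5_pix = standardize(era5, months, era5_pix_stats)
```

In this sketch, passing `era5_pix_stats` to `standardize(cerra, ...)` raises a broadcasting error because the `(4, 4)` statistics do not match the `(8, 8)` target grid, which mirrors why the dependent scheme does not apply directly in the pixel-wise case.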
+**Without past information**. This corresponds to normalizing each sample independently by the mean and standard deviation of the ERA5 field. This is known in the ML community as instance normalization. Only the statistics of the inputs can be used, since the outputs are not available during inference, but two variations are possible in our use case:
+  - Use the statistics of the input ERA5. Recall that it covers a wider area than CERRA.
+  - Use the statistics of the bicubically downscaled ERA5, which represents the same area as CERRA.
+The difference between these two approaches lies not in whether the statistics are calculated on the downscaled or the source ERA5, but in the area covered: the input patch encompasses a larger area, so its distribution can differ more from that of the output. Thus, the second approach seems more appropriate, as the distribution over the downscaled area will be closer to the output distribution.
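The two per-sample variants can be illustrated with a short NumPy sketch. It rests on several assumptions: the fields are synthetic, the helper names are hypothetical, and a plain crop of the ERA5 patch to the CERRA area stands in for the bicubically downscaled ERA5, since the point made above is about the area covered rather than the interpolation itself. The `denormalize` step is also an assumption about how predictions would be mapped back to physical units at inference.

```python
import numpy as np

def sample_stats(field):
    """Mean and standard deviation of a single 2-D field."""
    return float(field.mean()), float(field.std())

def normalize(field, mean, std):
    """Standardize a field with externally supplied statistics."""
    return (field - mean) / std

def denormalize(field, mean, std):
    """Invert normalize(), e.g. to map a prediction back to Kelvin."""
    return field * std + mean

rng = np.random.default_rng(0)

# Wide ERA5 input patch; CERRA is assumed to cover only its central 4x4 cells.
era5_wide = 285.0 + 5.0 * rng.standard_normal((8, 8))
inner = (slice(2, 6), slice(2, 6))   # hypothetical location of the CERRA area
era5_inner = era5_wide[inner]        # crop used in place of bicubic downscaling

# Synthetic high-resolution target over the same area as era5_inner.
cerra = np.kron(era5_inner, np.ones((4, 4))) + rng.standard_normal((16, 16))

# Variant 1: statistics of the full input patch (wider area than CERRA).
wide_stats = sample_stats(era5_wide)
# Variant 2: statistics restricted to the area shared with CERRA.
inner_stats = sample_stats(era5_inner)

# Input and target are normalized with the same ERA5-derived statistics,
# because the target is not available at inference time.
era5_norm = normalize(era5_wide, *wide_stats)
cerra_norm_wide = normalize(cerra, *wide_stats)
cerra_norm_inner = normalize(cerra, *inner_stats)

# The same statistics recover physical units from a normalized field.
cerra_back = denormalize(cerra_norm_inner, *inner_stats)
```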
 # Results