Queries regarding training

#20
by rakshith25 - opened

Thanks for open-sourcing this great work. Kudos to the team!
While going through the available documentation and trying to use the model for a few applications, there were some things I couldn't figure out. It would be great if someone from the team or the community could help me answer these.

  1. What is the size of the training dataset, in terms of the number of images used for MAE pre-training? Is it a curated subset of HLS Landsat, or were all available images used?
  2. What is the time period/interval between the timestamps provided to the model during training? What is the reasoning behind this choice, and does the choice of timestamps (during inference) impact performance?
  3. Was the MAE masking ratio 75%, and for how many epochs was the model trained?
  4. Why is the input shaped [B, C, T, H, W] rather than [B, T*C, H, W]? (See the sketch after this list.)
  5. To train the model, were all tiles from the continental US (CONUS) used, or only specific regions within CONUS?
  6. What is the quality of the MAE reconstruction in terms of SSIM? I tried the model on some patches over the US and the reconstruction quality was around 0.8, with patching artefacts.
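
Regarding question 4, here is a minimal sketch (assuming PyTorch and illustrative tensor sizes, not the model's actual configuration) of how the two layouts relate: keeping T as its own dimension allows a 3D patch embedding over (T, H, W), whereas folding time into channels would treat each timestamp's bands as extra channels of a single 2D image.

```python
import torch

# Illustrative sizes only (not the model's actual configuration).
B, C, T, H, W = 2, 6, 3, 224, 224  # batch, bands, timestamps, height, width

# Layout used by the model: time kept as its own dimension.
x_5d = torch.randn(B, C, T, H, W)

# Alternative layout: time folded into the channel dimension.
x_4d = x_5d.permute(0, 2, 1, 3, 4).reshape(B, T * C, H, W)

# The two layouts hold the same values; only the indexing differs.
print(x_5d.shape, x_4d.shape)
# torch.Size([2, 6, 3, 224, 224]) torch.Size([2, 18, 224, 224])
```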

Hope to get some insights on these. Thanks in advance!

I am also interested in these questions and would like to hear the insights. Great work on the model, and kudos!

IBM NASA Geospatial org

While I was not personally involved in the pre-training, I can say that a curated version of HLS was used. Images were randomly sampled across climate zones of the United States to ensure even coverage of different types of scenes. We recently published an article on arXiv that goes into the pre-training in more detail:

https://arxiv.org/pdf/2310.18660.pdf
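
As a rough illustration of the sampling strategy described above (a hypothetical sketch only; the zone labels, tile IDs, and per-zone counts are made up and not taken from the actual pre-training pipeline), stratified random sampling per climate zone could look like this:

```python
import random
from collections import defaultdict

def stratified_sample(tiles, samples_per_zone, seed=0):
    """Randomly pick an equal number of tiles from each climate zone.

    `tiles` is a list of (tile_id, climate_zone) pairs; both the zone
    labels and the per-zone count are illustrative assumptions.
    """
    rng = random.Random(seed)
    by_zone = defaultdict(list)
    for tile_id, zone in tiles:
        by_zone[zone].append(tile_id)
    selected = []
    for zone, zone_tiles in by_zone.items():
        k = min(samples_per_zone, len(zone_tiles))
        selected.extend(rng.sample(zone_tiles, k))
    return selected

# Hypothetical usage with made-up tile IDs and climate zones.
tiles = [("T01", "arid"), ("T02", "arid"), ("T03", "temperate"),
         ("T04", "temperate"), ("T05", "continental"), ("T06", "continental")]
print(stratified_sample(tiles, samples_per_zone=1))
```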

How can I use this model locally by downloading it?
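
A minimal sketch of downloading the checkpoint for local use (assuming the `huggingface_hub` Python library; the repository ID below is assumed to match this model page, and the local path is an arbitrary example):

```python
from huggingface_hub import snapshot_download

# Download the full model repository (weights, config, example scripts)
# to a local folder. repo_id is assumed to match this model page;
# local_dir is an arbitrary example path.
local_path = snapshot_download(
    repo_id="ibm-nasa-geospatial/Prithvi-100M",
    local_dir="./Prithvi-100M",
)
print("Model files downloaded to:", local_path)
```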
