Paolo-Fraccaro committed on
Commit
3557d80
1 Parent(s): 4076a1a

Update README.md

  1. README.md +27 -1
README.md CHANGED
@@ -4,4 +4,30 @@ tags:
4
  - Pytorch
5
  - Geospatial
6
  - Temporal ViT
7
- ---
7
+ ---
8
+
9
+ This repository includes the foundation model architecture of Prithvi, a first-of-its-kind temporal Vision Transformer pre-trained by the IBM and NASA team on Harmonised Landsat Sentinel 2 (HLS) data covering the continental US. The architecture is contained in the `hls-gfm` folder, alongside the information needed to obtain the pre-trained weights through Hugging Face.
10
+ This repo also contains a practical implementation of fine-tuning Prithvi for flood detection and fire scar detection, as examples of specific downstream applications. See the `fine-tuning-example` folder for more details.
11
+
12
+
13
+ ### Model and Input
14
+ The model expects remote sensing data in a video format (B, C, T, H, W), that is, batch, channels (bands), time steps, height, and width. Note that the temporal dimension is very important here and is absent from most
15
+ other work on remote sensing modeling. Being able to handle a time series of remote sensing images can help a variety of downstream tasks. The model can also handle static images, which can simply be fed into the model with T=1.
16
+
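The input layout described above can be illustrated with a minimal sketch that works on shape tuples only. The helper name `as_video_shape` is hypothetical and not part of this repository; it assumes a static image batch arrives as (B, C, H, W) and shows where the T=1 axis goes. With a PyTorch tensor `x`, the equivalent step would be `x.unsqueeze(2)`.

```python
from typing import Tuple


def as_video_shape(shape: Tuple[int, ...]) -> Tuple[int, ...]:
    """Map an input shape to the (B, C, T, H, W) layout.

    Hypothetical helper for illustration only:
    - a 5-D shape is assumed to already be (B, C, T, H, W) and is returned as-is
    - a 4-D shape is assumed to be a static image batch (B, C, H, W) and gets T=1
    """
    if len(shape) == 5:
        return tuple(shape)
    if len(shape) == 4:
        b, c, h, w = shape
        # Insert the temporal axis after the channel axis.
        return (b, c, 1, h, w)
    raise ValueError(f"expected a 4-D or 5-D shape, got {len(shape)} dimensions")


# A batch of 2 static images with 6 bands (the batch and image sizes are made up).
static_batch = as_video_shape((2, 6, 224, 224))

# A batch of 2 time series with 3 time steps each passes through unchanged.
series_batch = as_video_shape((2, 6, 3, 224, 224))
```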
17
+ ### Code
18
+ The model follows the [original MAE repo](https://github.com/facebookresearch/mae), with modifications including:
19
+ 1. replace 2D patch embed with 3D patch embed
20
+ 2. replace 2D positional embed with 3D positional embed
21
+ 3. replace 2D patchify and unpatchify with 3D
22
+ 4. etc.
23
+
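To make the 2D-to-3D change listed above concrete, the sketch below works out how many tokens a 3D patch embedding would produce and how long each flattened patch is. It is an illustration under assumptions, not the repository's implementation: the function name `patch_grid`, the patch size (1, 16, 16), and the input size are all hypothetical, since none of them are stated here.

```python
from typing import Tuple


def patch_grid(
    input_shape: Tuple[int, int, int, int],
    patch_size: Tuple[int, int, int],
) -> Tuple[int, int]:
    """Return (num_tokens, patch_dim) for non-overlapping 3D patches.

    input_shape is (C, T, H, W) for a single sample; patch_size is
    (pt, ph, pw). Both the function and its arguments are hypothetical.
    """
    c, t, h, w = input_shape
    pt, ph, pw = patch_size
    if t % pt or h % ph or w % pw:
        raise ValueError("T, H and W must be divisible by the patch size")
    # One token per patch along the time, height and width axes.
    num_tokens = (t // pt) * (h // ph) * (w // pw)
    # Each flattened patch holds every channel of every pixel it covers.
    patch_dim = c * pt * ph * pw
    return num_tokens, patch_dim


# Six bands, three time steps, 224 x 224 pixels (assumed sizes).
tokens_3d, dim_3d = patch_grid((6, 3, 224, 224), (1, 16, 16))

# With T=1 and three channels this reduces to the familiar 2D case.
tokens_2d, dim_2d = patch_grid((3, 1, 224, 224), (1, 16, 16))
```

With T=1 the token count matches a 2D patch grid, which is why a static image can be treated as a one-frame video.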
24
+ ### Pre-training
25
+ The model was pre-trained with Harmonised Landsat and Sentinel 2 data from NASA using the following bands:
26
+
27
+ * Blue
28
+ * Green
29
+ * Red
30
+ * Narrow NIR
31
+ * SWIR 1
32
+ * SWIR 2
33
+