---
license: apache-2.0
---

## Overview

[SatlasPretrain](https://satlas-pretrain.allen.ai) is a large-scale remote sensing image understanding dataset.
The models here are Swin Transformer backbones pre-trained on either the high-resolution images or the Sentinel-2 images in SatlasPretrain.

- `satlas-model-v1-highres.pth` is applicable for downstream tasks involving 0.5-2.0 m/pixel satellite or aerial imagery.
- `satlas-model-v1-lowres.pth` is applicable for downstream tasks involving [Sentinel-2 satellite images](https://sentinel.esa.int/web/sentinel/missions/sentinel-2).

The pre-trained backbones are expected to improve performance on a wide range of remote sensing and geospatial tasks, such as planetary and environmental monitoring.
They have already been deployed to develop robust models for detecting solar farms, wind turbines, offshore platforms, and tree cover in [Satlas](https://satlas.allen.ai), a platform for global geospatial data generated by AI from satellite imagery.

## Usage and Input Normalization

The backbones can be loaded for fine-tuning on downstream tasks:

    import torch
    import torchvision

    # Build an uninitialized Swin-v2-Base model to receive the pre-trained weights.
    model = torchvision.models.swin_transformer.swin_v2_b()
    # map_location='cpu' lets the checkpoint load on machines without a GPU.
    full_state_dict = torch.load('satlas-model-v1-highres.pth', map_location='cpu')
    # Extract just the Swin backbone parameters from the full state dict.
    swin_prefix = 'backbone.backbone.'
    swin_state_dict = {k[len(swin_prefix):]: v for k, v in full_state_dict.items() if k.startswith(swin_prefix)}
    model.load_state_dict(swin_state_dict)
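
The prefix filtering step can be tried without downloading a checkpoint. The sketch below applies the same dictionary comprehension to a toy dict. The helper name `strip_prefix` and the key names after `backbone.backbone.` are hypothetical and chosen only for illustration; the real checkpoint keys may differ.

```python
def strip_prefix(state_dict, prefix):
    """Keep only entries whose key starts with `prefix`, with the prefix removed."""
    return {k[len(prefix):]: v for k, v in state_dict.items() if k.startswith(prefix)}


# Toy stand-in for a full checkpoint; keys and values are made up for illustration.
toy_state_dict = {
    "backbone.backbone.features.0.0.weight": "w0",
    "backbone.backbone.norm.bias": "b0",
    "heads.0.weight": "h0",  # not part of the Swin backbone, so it is dropped
}

swin_only = strip_prefix(toy_state_dict, "backbone.backbone.")
print(sorted(swin_only))  # prints ['features.0.0.weight', 'norm.bias']
```

Entries without the prefix are discarded, so the resulting dict holds only keys that the Swin model itself is expected to recognize.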

The expected input is as follows:

- `satlas-model-v1-highres.pth`: expects 8-bit RGB high-resolution images, with the 0-255 RGB values normalized to 0-1 by dividing by 255.
- `satlas-model-v1-lowres.pth`: expects the TCI image from Sentinel-2 L1C scenes, which is an 8-bit image already derived from the B04 (red), B03 (green), and B02 (blue) bands. Normalize the 0-255 RGB values to 0-1 by dividing by 255.
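
The divide-by-255 normalization described above might look like the following minimal sketch. It assumes the image is already in memory as an `(H, W, 3)` uint8 NumPy array and that the model takes channels-first batches, as torchvision models generally do. The function name `to_model_input` and the 512x512 dummy image are illustrative assumptions, not part of the SatlasPretrain code.

```python
import numpy as np


def to_model_input(image_uint8: np.ndarray) -> np.ndarray:
    """Convert an (H, W, 3) uint8 RGB image to a (1, 3, H, W) float32 array in [0, 1]."""
    if image_uint8.dtype != np.uint8 or image_uint8.ndim != 3 or image_uint8.shape[2] != 3:
        raise ValueError("expected an (H, W, 3) uint8 RGB image")
    scaled = image_uint8.astype(np.float32) / 255.0  # 0-255 -> 0-1
    chw = np.transpose(scaled, (2, 0, 1))            # channels-last -> channels-first
    return chw[np.newaxis, ...]                      # add a batch dimension


# Random stand-in for a real RGB or TCI image; the size is arbitrary.
dummy_image = np.random.randint(0, 256, size=(512, 512, 3), dtype=np.uint8)
batch = to_model_input(dummy_image)
```

The resulting array can be wrapped with `torch.from_numpy(batch)` before being passed to the model.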

Please see [the SatlasPretrain GitHub repository](https://github.com/allenai/satlas/blob/main/SatlasPretrain.md) for more examples and usage options.
Models that use nine Sentinel-2 bands are also available there.

## Code

The training code and SatlasPretrain dataset are at https://github.com/allenai/satlas/.

SatlasPretrain is described in [a paper](https://arxiv.org/abs/2211.15660) appearing at the International Conference on Computer Vision (ICCV) in October 2023.

## Feedback

We welcome any feedback about the model or training data.
To contact us, please [open an issue on the SatlasPretrain GitHub repository](https://github.com/allenai/satlas/issues).