---
title: GEO-Bench Leaderboard
emoji: 🏆
colorFrom: purple
colorTo: green
sdk: docker
pinned: false
---

# 🏆 GEO-Bench Leaderboard

The [GEO-Bench leaderboard](https://huggingface.co/spaces/aialliance/GEO-Bench-Leaderboard) tracks the performance of geospatial foundation models on various benchmark datasets using the GEO-Bench benchmarking framework.

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Language: Python](https://img.shields.io/badge/language-Python%203.10%2B-green?logo=python&logoColor=green)](https://www.python.org)

## 1. How to Submit New Results

### 1.1. Create New Submission Directory
Create a new folder in the `new_submission` top directory:
```bash
geobench_leaderboard/
└── new_submission/
    ├── results_and_parameters.csv
    └── additional_info.json
```

### 1.2. Add Results and Parameters Details
Add a CSV file (`results_and_parameters.csv`) with the columns below. If terratorch-iterate is used for the experiments, this table can be created automatically once an experiment completes. See `examples/results_and_parameters.csv` for an example.
 - `backbone`: backbone used for the experiment (e.g. Prithvi-EO-V2 600M)
 - `dataset`: some or all of the GEO-Bench datasets. Please see the Info page to learn more.
 - `Metric`: the type of metric used for evaluation. Depending on the dataset, this may be one of the following: `Overall_Accuracy`, `Multilabel_F1_Score`, `Multiclass_Jaccard_Index`
 - `experiment_name`: if terratorch-iterate is used, this will be the experiment_name used in MLflow. Otherwise, a unique name may be used for all results relating to a single backbone
 - `batch_size_selection`: denotes whether the batch size was fixed or optimized during hyperparameter optimization. May be `fixed` or `optimized`
 - `early_stop_patience`: early stopping patience used for the trainer
 - `n_trials`: number of trials used for hyperparameter optimization
 - `Seed`: random seed used for the repeated experiment. At least 5 random seeds must be used for each backbone
 - `batch_size`: batch size used for repeated experiments for each backbone/dataset combination.
 - `weight_decay`: weight decay used for repeated experiments for each backbone/dataset combination.
 - `lr`: learning rate used for repeated experiments for each backbone/dataset combination. Obtained from hyperparameter optimization (HPO)
 - `test metric`: metric value obtained from running the backbone on the dataset during the repeated experiment. Please see the Info page to learn more.
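
Before opening a PR, it may help to sanity-check the CSV locally. The sketch below is not part of this repository: `check_results_rows` and `check_results_csv` are hypothetical helpers, the column names are copied from the list above (verify the exact spelling against `examples/results_and_parameters.csv`), and counting seeds per backbone/dataset pair is an assumption about how the 5-seed rule is applied.

```python
import csv
from collections import defaultdict

# Column names copied from the list above; check the exact spelling and case
# against examples/results_and_parameters.csv before relying on this.
REQUIRED_COLUMNS = [
    "backbone", "dataset", "Metric", "experiment_name",
    "batch_size_selection", "early_stop_patience", "n_trials",
    "Seed", "batch_size", "weight_decay", "lr", "test metric",
]
MIN_SEEDS = 5


def check_results_rows(rows, fieldnames):
    """Return a list of human-readable problems (empty if none were found)."""
    missing = [c for c in REQUIRED_COLUMNS if c not in (fieldnames or [])]
    if missing:
        return [f"missing columns: {missing}"]

    problems = []
    seeds = defaultdict(set)
    for row in rows:
        if row["batch_size_selection"] not in {"fixed", "optimized"}:
            problems.append(
                f"unexpected batch_size_selection: {row['batch_size_selection']!r}"
            )
        # Assumption: the 5-seed rule is checked per backbone/dataset pair.
        seeds[(row["backbone"], row["dataset"])].add(row["Seed"])

    for (backbone, dataset), seen in sorted(seeds.items()):
        if len(seen) < MIN_SEEDS:
            problems.append(
                f"{backbone} / {dataset}: {len(seen)} seed(s), need at least {MIN_SEEDS}"
            )
    return problems


def check_results_csv(path):
    """Read a CSV file and run the row-level checks on it."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        return check_results_rows(rows, reader.fieldnames)
```

Calling `check_results_csv("new_submission/results_and_parameters.csv")` returns an empty list when none of these checks fail; it does not replace review by the maintainers.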


### 1.3. Add Additional Information
Create a JSON file (`additional_info.json`) with information about your submission and any new models that will be included.
The JSON file MUST have the same file name and contain the same keys as the `examples/additional_info.json` file. 
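
One way to confirm that the keys match is to compare the two files programmatically. This is a minimal sketch under two assumptions: both files hold a JSON object at the top level, and only top-level keys need to match. The function names are hypothetical and not part of this repository.

```python
import json


def compare_json_keys(candidate, reference):
    """Return (missing, extra) top-level keys of candidate relative to reference."""
    missing = sorted(set(reference) - set(candidate))
    extra = sorted(set(candidate) - set(reference))
    return missing, extra


def check_additional_info(candidate_path, reference_path):
    """Load both JSON files and compare their top-level keys."""
    with open(candidate_path) as handle:
        candidate = json.load(handle)
    with open(reference_path) as handle:
        reference = json.load(handle)
    return compare_json_keys(candidate, reference)
```

For example, `check_additional_info("new_submission/additional_info.json", "examples/additional_info.json")` returns two empty lists when the key sets are identical.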


### 1.4. Submit PR

 - Fork the repository
 - Add your results following the structure above
 - Create a pull request to main and add more details about your submission in the PR comments


## 2. Benchmarking with Terratorch-Iterate
The [TerraTorch-Iterate](https://github.com/IBM/terratorch-iterate) library, based on [TerraTorch](https://github.com/IBM/terratorch), leverages MLflow for experiment logging, Optuna for hyperparameter optimization and Ray for parallelization. It includes functionality to perform both hyperparameter tuning and repeated experiments in the manner prescribed by the GEO-Bench protocol. Once benchmarking is complete, the `summarize` feature of `TerraTorch-Iterate` can be used to automatically create a `results_and_parameters.csv` file for submission.

### 2.1 Installation
Please see [TerraTorch-Iterate](https://github.com/IBM/terratorch-iterate) for installation instructions.

### 2.2 Running benchmark experiments
**On existing models**: To run experiments on an existing model, prepare a custom config file specifying the model and dataset parameters. To compare the performance of multiple models, define one config file per model, each with a unique experiment name. Please see the `examples` folder for sample config files. Each config file (experiment) can then be executed with the following command:

```
terratorch iterate --hpo --repeat --config <config-file>
```
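
When several models are being compared, the command above has to be run once per config file. The sketch below wraps that in a small Python loop; `build_iterate_command` and `run_all` are hypothetical helpers, the config file names in the usage note are placeholders, and the command line is copied from the block above. It defaults to a dry run that only prints the commands.

```python
import subprocess


def build_iterate_command(config_path):
    """Build the command shown above for one config file."""
    return ["terratorch", "iterate", "--hpo", "--repeat", "--config", str(config_path)]


def run_all(config_paths, dry_run=True):
    """Run (or, in a dry run, just print) one command per config file."""
    commands = [build_iterate_command(path) for path in config_paths]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops the loop if an experiment exits with an error.
            subprocess.run(command, check=True)
    return commands
```

For instance, `run_all(["model_a.yaml", "model_b.yaml"])` prints the two commands, and passing `dry_run=False` executes them one after another.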

**On new models**: New models can be evaluated by first onboarding them to the [TerraTorch](https://github.com/IBM/terratorch/) library. Once onboarded, benchmarking may be conducted as outlined above.


### 2.3 Summarizing and plotting results 
**Extract results and parameters**: The command below can be used to extract the results and hyperparameters into a file for submission to the leaderboard. Please see details at the following link: https://github.com/terrastackai/iterate?tab=readme-ov-file#summarizing-results.
```
terratorch iterate --summarize --config <summarize-config-file>
```
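
Once a `results_and_parameters.csv` file exists, a quick look at the spread across seeds can catch obvious problems before submission. The sketch below assumes the column names described in Section 1.2 (`backbone`, `dataset`, `test metric`) and reports a plain mean and sample standard deviation; the leaderboard itself may aggregate results differently. `summarize_rows` and `summarize_csv` are hypothetical helpers.

```python
import csv
from collections import defaultdict
from statistics import mean, stdev


def summarize_rows(rows):
    """Map (backbone, dataset) to (mean, sample std, number of runs)."""
    scores = defaultdict(list)
    for row in rows:
        scores[(row["backbone"], row["dataset"])].append(float(row["test metric"]))
    return {
        key: (mean(values), stdev(values) if len(values) > 1 else 0.0, len(values))
        for key, values in scores.items()
    }


def summarize_csv(path):
    """Read the results CSV and summarize the test metric per backbone/dataset."""
    with open(path, newline="") as handle:
        return summarize_rows(list(csv.DictReader(handle)))
```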