AGarioud committed on
Commit
49925a8
1 Parent(s): 196a58b

Update README.md

Files changed (1)
  1. README.md +46 -67
README.md CHANGED
@@ -17,19 +17,19 @@ model-index:
17
  metrics:
18
  - name: mIoU
19
  type: mIoU
20
- value: 54.7168
21
  - name: Overall Accuracy
22
  type: OA
23
  value: 76.3711
24
  - name: Fscore
25
  type: Fscore
26
- value: 67.6063
27
  - name: Precision
28
  type: Precision
29
- value: 69.3481
30
  - name: Recall
31
  type: Recall
32
- value: 67.6565
33
 
34
  - name: IoU Buildings
35
  type: IoU
@@ -78,6 +78,7 @@ pipeline_tag: image-segmentation
78
  ---
79
 
80
  # FLAIR model collection
 
81
  The FLAIR models are a collection of semantic segmentation models initially developed to classify land cover on very high resolution aerial images (more specifically the French [BD ORTHO®](https://geoservices.ign.fr/bdortho) product).
82
  The distributed pre-trained models differ in their :
83
  * dataset for training : [**FLAIR** dataset](https://huggingface.co/datasets/IGNF/FLAIR) or the extended version of this dataset, **FLAIR-INC** (x 3.5 patches). Only the FLAIR dataset is open at the moment.
@@ -98,10 +99,8 @@ The general characteristics of this specific model **FLAIR-INC_RVBIE_resnet34_un
98
 
99
  ## Model Information
100
 
101
- <!-- Provide the basic links for the model. -->
102
-
103
- - **Repository:** https://github.com/IGNF/FLAIR-1-AI-Challenge
104
- - **Paper [optional]:** https://arxiv.org/pdf/2211.12979.pdf
105
  - **Developed by:** IGN
106
  - **Compute infrastructure:**
107
  - software: python, pytorch-lightning
@@ -110,18 +109,17 @@ The general characteristics of this specific model **FLAIR-INC_RVBIE_resnet34_un
110
 
111
 
112
  ## Uses
113
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
114
 
115
  Although the model can be applied to other types of very high spatial resolution earth observation images, it was initially developed to tackle the problem of classifying aerial images acquired over the French territory.
116
  The product, called [BD ORTHO®](https://geoservices.ign.fr/bdortho), has its own spatial and radiometric specifications. The model is not intended to be generic to other types of very high spatial resolution images but is specific to BD ORTHO images.
117
- As a result, the prediction produced by the model would be all the better as the user images are similar to the original ones.
118
 
119
  _**Radiometry of input images**_ :
120
- The BD ORTHO input images are distributed in 8-bit encoding format per channel. When traning the model, input normalization was performed (see section **Traing Details**).
121
  It is recommended that the user apply the same type of input normalization when running inference with the model.
122
 
123
  _**Multi-domain model**_ :
124
- The FLAIR-INC dataset that was used for training is composed of 75 radiometric domains. In the case of aerial images, domain shifts are frequent and are mainly due to : the date of acquisition of the aerial survey (april to november), the spatial domain (equivalent to a french department administrative division) and downstream radimetric processing.
125
  By construction (75 domains are sampled), the model is robust to these shifts and can be applied to any images of the [BD ORTHO® product](https://geoservices.ign.fr/bdortho).
126
 
127
  _**Specification for the Elevation channel**_ :
@@ -130,18 +128,16 @@ When decoded to [0,255] ints, a difference of 1 should coresponds to 0.2 meters
130
 
131
 
132
  _**Land Cover classes of prediction**_ :
133
- The orginial class nomenclature of the FLAIR Dataset is made up of 19 classes (See the [FLAIR dataset](https://huggingface.co/datasets/IGNF/FLAIR) page for details).
134
- However 3 classes corresponding to uncertain labelisation (Mixed (16), Ligneous (17) and Other (19)) and 1 class with very poor labelling (Clear cut (15)) were deasctivated during training.
135
- As a result, the logits produced by the model are of size 19x1, but classes n° 15, 16, 17 and 19 should appear at 0 in the logits and should never predicted in the Argmax.
136
 
137
 
138
- <!-- ## Bias, Risks, and Limitations -->
139
- ## Bias, Risks, Limitations and Recommendations
140
 
141
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
142
 
143
  _**Using the model on input images with other spatial resolution**_ :
144
- The FLAIR-INC_RVBIE_resnet34_unet_15cl_norm model was trained with fixed scale conditions. All patches used for training are derived from aerial images of 0.2 meters spatial resolution. Only flip and rotate augmentation were performed during the training process.
145
  No data augmentation method concerning scale change was used during training. The user should be aware that generalization issues can occur when applying this model to images with different spatial resolutions.
146
 
147
  _**Using the model for other remote sensing sensors**_ :
@@ -153,11 +149,6 @@ The FLAIR-INC_RVBIE_resnet34_unet_15cl_norm model was trained on patches reprens
153
  The user should be aware that applying the model to other types of landscapes may imply a drop in model metrics.
154
 
155
 
156
- <!--{{ bias_risks_limitations | default("[More Information Needed]", true)}}-->
157
- <!--### Recommendations-->
158
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
159
- <!--{{ bias_recommendations | default("Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.", true)}}-->
160
-
161
 
162
  ## How to Get Started with the Model
163
 
@@ -171,26 +162,23 @@ Use the code below to get started with the model.
171
 
172
  ### Training Data
173
 
174
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
175
  218 400 patches of 512 x 512 pixels were used to train the **FLAIR-INC_RVBIE_resnet34_unet_15cl_norm** model.
176
  The train/validation split was performed patchwise to obtain an 80% / 20% distribution between train and validation.
177
  Annotation was performed at the _zone_ level (~100 patches per _zone_). Spatial independence between patches is guaranteed, as patches from the same _zone_ were assigned to the same set (TRAIN or VALIDATION).
178
- Here are the number of patches used for train and validation :
179
  | Set | Number of patches |
  | -------------- | ----------------- |
  | TRAIN set | 174 700 patches |
  | VALIDATION set | 43 700 patches |
181
 
182
- <!--{{ training_data | default("[More Information Needed]", true)}} -->
183
 
184
  ### Training Procedure
185
 
186
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
187
 
188
  #### Preprocessing [optional]
189
 
190
- For traning the model, input normalization was performed so as the input dataset has **a mean=0** and a **standard deviation = 1** channel wise.
191
  We used the statistics of TRAIN+VALIDATION for input normalization. It is recommended that the user apply the same type of input normalization.
192
 
193
- Here are the statistics of the TRAIN+VALIDATIOn set :
194
 
195
  | Modalities | Mean (Train + Validation) |Std (Train + Validation) |
196
  | ----------------------- | ----------- |----------- |
@@ -201,7 +189,7 @@ Here are the statistics of the TRAIN+VALIDATIOn set :
201
  | Elevation Channel (E) | 53.26 |79.30 |
202
 
203
 
204
- <!--{{ preprocessing | default("[More Information Needed]", true)}} -->
205
 
206
 
207
  #### Training Hyperparameters
@@ -229,39 +217,37 @@ Here are the statistics of the TRAIN+VALIDATIOn set :
229
 
230
  #### Speeds, Sizes, Times [optional]
231
 
232
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
233
 
234
  The FLAIR-INC_RVBIE_resnet34_unet_15cl_norm model was trained on HPC/AI resources provided by GENCI-IDRIS (Grant 2022-A0131013803).
235
- 16 V100 GPUs were requested ( 4 nodes, 4 GPUS per node). With this configuration the approximate learning time is 6 minutes per epoch.
236
 
237
  FLAIR-INC_RVBIE_resnet34_unet_15cl_norm was obtained for num_epoch=76 with corresponding val_loss=0.56.
238
 
239
- <!-- <img src="train_loss_FLAIR-INC_RGBIE_resnet34_unet_15cl_norm.png" alt="drawing" style="width:200px;"/>-->
240
- <!-- ![](train_loss_FLAIR-INC_RGBIE_resnet34_unet_15cl_norm.png)| ![](val_loss_FLAIR-INC_RGBIE_resnet34_unet_15cl_norm.png)-->
241
-
242
- | <div style="width:290px">TRAIN loss</div> |<div style="width:290px">VALIDATION loss</div> |
243
- | --------------------------------------- | ------------------------------------- |
244
- | `<img src="train_loss_FLAIR-INC_RGBIE_resnet34_unet_15cl_norm.png" alt="drawing" style="width:300px;"/> |`<img src="val_loss_FLAIR-INC_RGBIE_resnet34_unet_15cl_norm.png" alt="drawing" style="width:300px;"/> |
245
 
 
 
 
 
 
 
 
 
 
 
246
 
247
- <!-- {{ speeds_sizes_times | default("[More Information Needed]", true)}}-->
248
 
249
  ## Evaluation
250
 
251
- <!-- This section describes the evaluation protocols and provides the results. -->
252
-
253
  ### Testing Data, Factors & Metrics
254
 
255
  #### Testing Data
256
 
257
- <!-- This should link to a Dataset Card if possible. -->
258
  The evaluation was performed on a TEST set of 31 750 patches that are independent from the TRAIN and VALIDATION patches. They represent 15 spatio-temporal domains.
259
  The TEST set corresponds to the union of the TEST sets of the scientific challenges FLAIR#1 and FLAIR#2. See the [FLAIR challenge page](https://ignf.github.io/FLAIR/) for more details.
260
 
261
  The choice of a separate TEST set instead of cross validation was made to be coherent with the FLAIR challenges.
262
  However, the metrics for the challenge were calculated on 12 classes, and the TEST set was built accordingly.
263
- As a result the _Snow_ class is unfortunately absent from the TEST set.
264
- <!-- {{ testing_data | default("[More Information Needed]", true)}} -->
265
 
266
  #### Metrics
267
 
@@ -288,47 +274,41 @@ The following table give the class-wise metrics :
288
  | _snow_ | _00.00_ | _00.00_ | _00.00_ | _00.00_ |
289
  | greenhouse | 39.45 | 56.57 | 45.52 | 74.72 |
290
  | ----------------------- | ----------|---------|---------|---------|
291
- | **average** | **58.63** | **72.44** | **74.3** | **72.49** |
292
 
293
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
294
 
295
- {{ testing_metrics | default("[More Information Needed]", true)}}
296
 
297
- The following illustration give the confusion matrix :
298
  * Left : normalised according to columns; columns sum to 100% and the **precision** is on the diagonal of the matrix
299
  * Right : normalised according to rows; rows sum to 100% and the **recall** is on the diagonal of the matrix
300
 
301
 
302
- | <div style="width:290px">Normalised confusion Matrix (precision)</div> |<div style="width:290px">Normalised Confusion Matrix (recall)</div> |
303
- | --------------------------------------- | ------------------------------------- |
304
- | `<img src="FLAIR-INC_RVBIE_resnet34_unet_15cl_norm_cm-precision.png" alt="drawing" style="width:300px;"/> |`<img src="FLAIR-INC_RVBIE_resnet34_unet_15cl_norm_cm-recall.png" alt="drawing" style="width:300px;"/> |
305
-
306
-
307
- ### Results
 
 
 
 
308
 
309
- <!-- Gio : Add inferenvce Sample ??? -->
310
 
311
- {{ results | default("[More Information Needed]", true)}}
312
 
313
- #### Summary
314
 
315
- {{ results_summary | default("", true) }}
316
 
 
317
 
318
- ## Technical Specifications [optional]
319
-
320
- ### Model Architecture and Objective
321
-
322
- {{ model_specs | default("[More Information Needed]", true)}}
323
 
324
 
325
  ## Citation [optional]
326
 
327
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
328
 
329
  **BibTeX:**
330
 
331
- @misc{garioud2023flair,
332
  title={FLAIR: a Country-Scale Land Cover Semantic Segmentation Dataset From Multi-Source Optical Imagery},
333
  author={Anatol Garioud and Nicolas Gonthier and Loic Landrieu and Apolline De Wit and Marion Valette and Marc Poupée and Sébastien Giordano and Boris Wattrelos},
334
  year={2023},
@@ -337,7 +317,6 @@ The following illustration give the confusion matrix :
337
  primaryClass={cs.CV}
338
  }
339
 
340
- {{ citation_bibtex | default("[More Information Needed]", true)}}
341
 
342
  **APA:**
343
  Garioud, A., Gonthier, N., Landrieu, L., De Wit, A., Valette, M., Poupée, M., ... & Wattrelos, B. (2023). FLAIR: a Country-Scale Land Cover Semantic Segmentation Dataset From Multi-Source Optical Imagery. arXiv preprint arXiv:2310.13336.
 
17
  metrics:
18
  - name: mIoU
19
  type: mIoU
20
+ value: 58.63
21
  - name: Overall Accuracy
22
  type: OA
23
  value: 76.3711
24
  - name: Fscore
25
  type: Fscore
26
+ value: 72.4353
27
  - name: Precision
28
  type: Precision
29
+ value: 74.3015
30
  - name: Recall
31
  type: Recall
32
+ value: 72.4891
33
 
34
  - name: IoU Buildings
35
  type: IoU
 
78
  ---
79
 
80
  # FLAIR model collection
81
+
82
  The FLAIR models are a collection of semantic segmentation models initially developed to classify land cover on very high resolution aerial images (more specifically the French [BD ORTHO®](https://geoservices.ign.fr/bdortho) product).
83
  The distributed pre-trained models differ in their :
84
  * dataset for training : [**FLAIR** dataset](https://huggingface.co/datasets/IGNF/FLAIR) or the extended version of this dataset, **FLAIR-INC** (x 3.5 patches). Only the FLAIR dataset is open at the moment.
 
99
 
100
  ## Model Information
101
 
102
+ - **Code repository:** https://github.com/IGNF/FLAIR-1-AI-Challenge
103
+ - **Paper:** https://arxiv.org/pdf/2211.12979.pdf
 
 
104
  - **Developed by:** IGN
105
  - **Compute infrastructure:**
106
  - software: python, pytorch-lightning
 
109
 
110
 
111
  ## Uses
 
112
 
113
  Although the model can be applied to other types of very high spatial resolution earth observation images, it was initially developed to tackle the problem of classifying aerial images acquired over the French territory.
114
  The product, called [BD ORTHO®](https://geoservices.ign.fr/bdortho), has its own spatial and radiometric specifications. The model is not intended to be generic to other types of very high spatial resolution images but is specific to BD ORTHO images.
115
+ Consequently, the model’s predictions will be more reliable when the user’s images are similar to the original ones.
116
 
117
  _**Radiometry of input images**_ :
118
+ The BD ORTHO input images are distributed in 8-bit encoding format per channel. When training the model, input normalization was performed (see section **Training Details**).
119
  It is recommended that the user apply the same type of input normalization when running inference with the model.
120
 
121
  _**Multi-domain model**_ :
122
+ The FLAIR-INC dataset that was used for training is composed of 75 radiometric domains. In the case of aerial images, domain shifts are frequent and are mainly due to : the date of acquisition of the aerial survey (from April to November), the spatial domain (equivalent to a French administrative department) and downstream radiometric processing.
123
  By construction (75 domains are sampled), the model is robust to these shifts and can be applied to any images of the [BD ORTHO® product](https://geoservices.ign.fr/bdortho).
124
 
125
  _**Specification for the Elevation channel**_ :
 
128
 
129
 
130
  _**Land Cover classes of prediction**_ :
131
+ The original class nomenclature of the FLAIR dataset encompasses 19 classes (see the [FLAIR dataset](https://huggingface.co/datasets/IGNF/FLAIR) page for details).
132
+ However, 3 classes corresponding to uncertain labelling (Mixed (16), Ligneous (17) and Other (19)) and 1 class with very poor labelling (Clear cut (15)) were deactivated during training.
133
+ As a result, the logits produced by the model are of size 19x1, but classes n° 15, 16, 17 and 19 should appear at 0 in the logits and should not be present in the final argmax product.
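For downstream use, these deactivated classes can be masked out explicitly before the argmax. A minimal NumPy sketch (an illustration, not the card's official inference code; it assumes per-pixel logits of shape `(19, H, W)` with 0-based channel indices, so classes 15, 16, 17 and 19 map to indices 14, 15, 16 and 18):

```python
import numpy as np

# Classes deactivated during training (1-based: 15, 16, 17, 19),
# mapped to 0-based channel indices of the 19-channel logits.
DEACTIVATED = [14, 15, 16, 18]

def predict_classes(logits: np.ndarray) -> np.ndarray:
    """Argmax over the class axis, with deactivated classes masked out.

    logits: array of shape (19, H, W).
    Returns 0-based class indices of shape (H, W).
    """
    masked = logits.copy()
    masked[DEACTIVATED] = -np.inf  # deactivated classes can never win the argmax
    return masked.argmax(axis=0)
```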
134
 
135
 
 
 
136
 
137
+ ## Bias, Risks, Limitations and Recommendations
138
 
139
  _**Using the model on input images with other spatial resolution**_ :
140
+ The FLAIR-INC_RVBIE_resnet34_unet_15cl_norm model was trained with fixed scale conditions. All patches used for training are derived from aerial images with 0.2 meters spatial resolution. Only flip and rotate augmentations were performed during the training process.
141
  No data augmentation method concerning scale change was used during training. The user should be aware that generalization issues can occur when applying this model to images with different spatial resolutions.
142
 
143
  _**Using the model for other remote sensing sensors**_ :
 
149
  The user should be aware that applying the model to other types of landscapes may imply a drop in model metrics.
150
 
151
 
 
 
 
 
 
152
 
153
  ## How to Get Started with the Model
154
 
 
162
 
163
  ### Training Data
164
 
 
165
  218 400 patches of 512 x 512 pixels were used to train the **FLAIR-INC_RVBIE_resnet34_unet_15cl_norm** model.
166
  The train/validation split was performed patchwise to obtain an 80% / 20% distribution between train and validation.
167
  Annotation was performed at the _zone_ level (~100 patches per _zone_). Spatial independence between patches is guaranteed, as patches from the same _zone_ were assigned to the same set (TRAIN or VALIDATION).
168
+ The following numbers of patches were used for train and validation :
169
  | Set | Number of patches |
  | -------------- | ----------------- |
  | TRAIN set | 174 700 patches |
  | VALIDATION set | 43 700 patches |
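The zone-wise assignment described above can be sketched as follows. This is a toy illustration with hypothetical patch and zone identifiers, not the actual split code:

```python
import random

def split_by_zone(patches, train_ratio=0.8, seed=42):
    """Assign whole zones to TRAIN or VALIDATION so that patches
    from the same zone never end up in different sets.

    patches: list of (patch_id, zone_id) tuples.
    """
    zones = sorted({zone for _, zone in patches})
    rng = random.Random(seed)
    rng.shuffle(zones)
    n_train = int(round(len(zones) * train_ratio))
    train_zones = set(zones[:n_train])
    train = [p for p, z in patches if z in train_zones]
    val = [p for p, z in patches if z not in train_zones]
    return train, val
```

Splitting at the zone level rather than the patch level is what guarantees the spatial independence between the two sets.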
171
 
 
172
 
173
  ### Training Procedure
174
 
 
175
 
176
  #### Preprocessing [optional]
177
 
178
+ For training the model, input normalization was performed to centre-reduce the dataset (**mean = 0** and **standard deviation = 1**, channel-wise).
179
  We used the statistics of TRAIN+VALIDATION for input normalization. It is recommended that the user apply the same type of input normalization.
180
 
181
+ Statistics of the TRAIN+VALIDATION set :
182
 
183
  | Modalities | Mean (Train + Validation) |Std (Train + Validation) |
184
  | ----------------------- | ----------- |----------- |
 
189
  | Elevation Channel (E) | 53.26 |79.30 |
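Applied at inference time, this normalization might look like the following sketch. Only the elevation statistics (53.26 / 79.30) come from the table above; the other `means`/`stds` entries are placeholders to be replaced with the card's full per-channel statistics:

```python
import numpy as np

# Channel-wise TRAIN+VALIDATION statistics for R, G, B, NIR, Elevation.
# Only the elevation values are taken from the table above; the other
# entries are placeholders, NOT the model's real statistics.
means = np.array([0.0, 0.0, 0.0, 0.0, 53.26])
stds = np.array([1.0, 1.0, 1.0, 1.0, 79.30])

def normalize(image: np.ndarray) -> np.ndarray:
    """Centre-reduce a (H, W, C) image channel-wise."""
    return (image - means) / stds
```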
190
 
191
 
192
+
193
 
194
 
195
  #### Training Hyperparameters
 
217
 
218
  #### Speeds, Sizes, Times [optional]
219
 
 
220
 
221
  The FLAIR-INC_RVBIE_resnet34_unet_15cl_norm model was trained on HPC/AI resources provided by GENCI-IDRIS (Grant 2022-A0131013803).
222
+ 16 V100 GPUs were used (4 nodes, 4 GPUs per node). With this configuration, the approximate training time is 6 minutes per epoch.
223
 
224
  FLAIR-INC_RVBIE_resnet34_unet_15cl_norm was obtained for num_epoch=76 with corresponding val_loss=0.56.
225
 
 
 
 
 
 
 
226
 
227
+ <div style="display: flex; justify-content: space-between; width: 50%">
228
+ <div style="width: 45%;">
229
+ <p>TRAIN loss</p>
230
+ <img src="train_loss_FLAIR-INC_RGBIE_resnet34_unet_15cl_norm.png" alt="drawing" style="width: 100%;"/>
231
+ </div>
232
+ <div style="width: 45%;">
233
+ <p>VALIDATION loss</p>
234
+ <img src="val_loss_FLAIR-INC_RGBIE_resnet34_unet_15cl_norm.png" alt="drawing" style="width: 100%;"/>
235
+ </div>
236
+ </div>
237
 
 
238
 
239
  ## Evaluation
240
 
 
 
241
  ### Testing Data, Factors & Metrics
242
 
243
  #### Testing Data
244
 
 
245
  The evaluation was performed on a TEST set of 31 750 patches that are independent from the TRAIN and VALIDATION patches. They represent 15 spatio-temporal domains.
246
  The TEST set corresponds to the union of the TEST sets of the scientific challenges FLAIR#1 and FLAIR#2. See the [FLAIR challenge page](https://ignf.github.io/FLAIR/) for more details.
247
 
248
  The choice of a separate TEST set instead of cross validation was made to be coherent with the FLAIR challenges.
249
  However, the metrics for the challenge were calculated on 12 classes, and the TEST set was built accordingly.
250
+ As a result, the _snow_ class is absent from the TEST set.
 
251
 
252
  #### Metrics
253
 
 
274
  | _snow_ | _00.00_ | _00.00_ | _00.00_ | _00.00_ |
275
  | greenhouse | 39.45 | 56.57 | 45.52 | 74.72 |
276
  | ----------------------- | ----------|---------|---------|---------|
277
+ | **average** | **58.63** | **72.44** | **74.3** | **72.49** |
278
 
 
279
 
 
280
 
281
+ The following illustrations give the resulting confusion matrices :
282
  * Left : normalised according to columns; columns sum to 100% and the **precision** is on the diagonal of the matrix
283
  * Right : normalised according to rows; rows sum to 100% and the **recall** is on the diagonal of the matrix
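These two normalizations can be reproduced from any raw confusion matrix; a minimal sketch:

```python
import numpy as np

def normalize_confusion(cm: np.ndarray):
    """Return column- and row-normalised versions of a confusion
    matrix cm, where cm[i, j] counts reference class i predicted as j.

    Column normalisation puts precision on the diagonal;
    row normalisation puts recall on the diagonal.
    """
    col = cm / cm.sum(axis=0, keepdims=True)  # columns sum to 1 -> precision
    row = cm / cm.sum(axis=1, keepdims=True)  # rows sum to 1 -> recall
    return col, row
```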
284
 
285
 
286
+ <div style="display: flex; justify-content: space-between;">
287
+ <div style="width: 45%;">
288
+ <p>Normalised confusion Matrix (precision)</p>
289
+ <img src="FLAIR-INC_RVBIE_resnet34_unet_15cl_norm_cm-precision.png" alt="drawing" style="width: 100%;"/>
290
+ </div>
291
+ <div style="width: 45%;">
292
+ <p>Normalised Confusion Matrix (recall)</p>
293
+ <img src="FLAIR-INC_RVBIE_resnet34_unet_15cl_norm_cm-recall.png" alt="drawing" style="width: 100%;"/>
294
+ </div>
295
+ </div>
296
 
 
297
 
 
298
 
 
299
 
 
300
 
301
+ ### Results
302
 
303
+ Samples of results
 
 
 
 
304
 
305
 
306
  ## Citation [optional]
307
 
 
308
 
309
  **BibTeX:**
310
 
311
+ @inproceedings{garioud2023flair,
312
  title={FLAIR: a Country-Scale Land Cover Semantic Segmentation Dataset From Multi-Source Optical Imagery},
313
  author={Anatol Garioud and Nicolas Gonthier and Loic Landrieu and Apolline De Wit and Marion Valette and Marc Poupée and Sébastien Giordano and Boris Wattrelos},
314
  year={2023},
 
317
  primaryClass={cs.CV}
318
  }
319
 
 
320
 
321
  **APA:**
322
  Garioud, A., Gonthier, N., Landrieu, L., De Wit, A., Valette, M., Poupée, M., ... & Wattrelos, B. (2023). FLAIR: a Country-Scale Land Cover Semantic Segmentation Dataset From Multi-Source Optical Imagery. arXiv preprint arXiv:2310.13336.