CharlesGaydon committed on
Commit c4a0373
1 Parent(s): 610f7f8

Update README.md

Files changed (1): README.md (+22 −14)
README.md CHANGED
@@ -71,15 +71,19 @@ with very-high-definition aerial images from the ([BD ORTHO®](https://geoservic
 
 **_Data preprocessing_**: ?? keep ?
 
- **_Multi-domain model_**: The FRACTAL dataset used for training covers 5 spatial domains from 5 southern regions of metropolitan France.
- The 250 km² of data in FRACTAL were sampled from an original 17440 km² area, and cover a wide diversity of landscapes and scenes.
- While large and diverse, this data only covers a fraction of the French territory, and the model should be used with adequate verifications when applied to new domains.
- This being said, while domain shifts are frequent for aerial imageries due to different acquisition conditions and downstream data processing,
- the aerial lidar point clouds are expected to have more consistent characteristiques
- (density, range of acquisition angle, etc.) across spatial domains.
 
 ## Bias, Risks, Limitations and Recommendations
 
 ---
 
 ## How to Get Started with the Model
@@ -94,13 +98,17 @@ For convenience and scalable model deployment, Myria3D comes with a Dockerfile.
 
 ## Training Details
 
- The data comes from the Lidar HD program, more specifically from acquisition areas that underwent automated classification followed by manual correction (so-called "optimized Lidar HD").
 It meets the quality requirements of the Lidar HD program, which accepts a controlled level of classification errors for each semantic class.
 
 ### Training Data
 
- 80,000 point cloud patches of 50 x 50 meters were used to train the **FRACTAL-LidarHD_7cl_randlanet** model.
- 10,000 additional patches were used for model validation.
 
 ### Training Procedure
 
@@ -110,6 +118,7 @@ Point clouds were preprocessed for training with point subsampling, filtering of
 For inference, a preprocessing as close as possible should be used. Refer to the inference configuration file, and to the Myria3D code repository (V3.8).
 
 #### Training Hyperparameters
 - Model architecture: RandLa-Net (implemented with the Pytorch-Geometric framework in [Myria3D](https://github.com/IGNF/myria3d/blob/main/myria3d/models/modules/pyg_randla_net.py))
 - Augmentation :
   - VerticalFlip(p=0.5)
@@ -135,19 +144,18 @@ For inference, a preprocessing as close as possible should be used. Refer to the
 - Batch size: 10 (x 6 GPUs)
 - Number of epochs : 100 (min) - 150 (max)
 - Early stopping : patience 6 and val_loss as monitor criterium
 - Optimizer : Adam
- - Schaeduler : mode = "min", factor = 0.5, patience = 20, cooldown = 5
 - Learning rate : 0.004
 
 #### Speeds, Sizes, Times
 
- The **FRACTAL-LidarHD_7cl_randlanet** model was trained on an in-house HPC cluster.
- 6 V100 GPUs were used (2 nodes, 3 GPUS per node). With this configuration the approximate learning time is 30 minutes per epoch.
-
 The model was obtained for num_epoch=21 with corresponding val_loss=0.112.
 
 <div style="position: relative; text-align: center;">
- <p style="margin: 0;">TRAIN loss</p>
 <img src="FRACTAL-LidarHD_7cl_randlanet-train_val_losses.excalidraw.png" alt="train and val losses" style="width: 60%; display: block; margin: 0 auto;"/>
 </div>
 
 **_Data preprocessing_**: ?? keep ?
 
 ## Bias, Risks, Limitations and Recommendations
 
+ **_Spatial Generalization_**: The FRACTAL dataset used for training covers 5 spatial domains from 5 southern regions of metropolitan France.
+ While large and diverse, the dataset covers only a fraction of the French territory and is not representative of its full diversity (landscapes, hardscapes, human-made objects...).
+ Adequate verification and evaluation should be performed when the model is applied to new spatial domains.
+
+ **_Using the model for other data sources_**: The model was trained on Lidar HD data that was colorized with very high resolution aerial images from the ORTHO HR database.
+ These data sources have their own specificities in terms of resolution and spectral domains. Users can expect a drop in performance with other 3D and 2D data sources.
+ This being said, while domain shifts are frequent for aerial imagery due to different acquisition conditions and downstream data processing,
+ aerial lidar point clouds of comparable point densities (~40 pts/m²) are expected to have more consistent geometric characteristics across spatial domains.
+
 ---
 
 ## How to Get Started with the Model
 
 
 ## Training Details
 
+ The data comes from the Lidar HD program, more specifically from acquisition areas that underwent automated classification followed by manual correction
+ (so-called "optimized Lidar HD").
 It meets the quality requirements of the Lidar HD program, which accepts a controlled level of classification errors for each semantic class.
+ The model was trained on FRACTAL, a benchmark dataset for semantic segmentation. FRACTAL contains 250 km² of data sampled from an original 17440 km² area, with
+ a large diversity of landscapes and scenes.
 
 ### Training Data
 
+ 80,000 point cloud patches of 50 x 50 meters each (200 km²) were used to train the **FRACTAL-LidarHD_7cl_randlanet** model.
+ 10,000 additional patches (25 km²) were used for model validation.
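The train/validation areas quoted above follow directly from the patch counts and the 50 m × 50 m patch size; a quick arithmetic check:

```python
# Sanity-check of the train/validation areas quoted above.
patch_side_m = 50                              # 50 m x 50 m patches

train_area_km2 = 80_000 * patch_side_m**2 / 1e6  # m² -> km²
val_area_km2 = 10_000 * patch_side_m**2 / 1e6

print(train_area_km2, val_area_km2)  # 200.0 25.0
```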
 
 ### Training Procedure
 
 For inference, a preprocessing as close as possible should be used. Refer to the inference configuration file, and to the Myria3D code repository (V3.8).
 
 #### Training Hyperparameters
+ ```yaml
 - Model architecture: RandLa-Net (implemented with the Pytorch-Geometric framework in [Myria3D](https://github.com/IGNF/myria3d/blob/main/myria3d/models/modules/pyg_randla_net.py))
 - Augmentation :
   - VerticalFlip(p=0.5)
 - Batch size: 10 (x 6 GPUs)
 - Number of epochs : 100 (min) - 150 (max)
 - Early stopping : patience 6 and val_loss as monitor criterium
+ - Loss: Cross-Entropy
 - Optimizer : Adam
+ - Scheduler : mode = "min", factor = 0.5, patience = 20, cooldown = 5
 - Learning rate : 0.004
+ ```
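The scheduler settings above describe a reduce-on-plateau policy: when the monitored val_loss (mode = "min") stops improving for `patience` epochs, the learning rate is multiplied by `factor`, then updates are suspended for `cooldown` epochs. The sketch below is a plain-Python illustration of that logic with these settings, using a hypothetical class name; it is not the actual Myria3D/PyTorch implementation:

```python
# Minimal sketch of a reduce-on-plateau learning-rate schedule with the
# settings above (mode="min", factor=0.5, patience=20, cooldown=5).
# Hypothetical helper for illustration, not the Myria3D/PyTorch code.
class ReduceOnPlateauSketch:
    def __init__(self, lr=0.004, factor=0.5, patience=20, cooldown=5):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.cooldown = cooldown
        self.best = float("inf")   # mode="min": lower val_loss is better
        self.bad_epochs = 0        # epochs without improvement
        self.cooldown_left = 0     # epochs to wait after a reduction

    def step(self, val_loss):
        if self.cooldown_left > 0:
            self.cooldown_left -= 1
            self.bad_epochs = 0
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience and self.cooldown_left == 0:
                self.lr *= self.factor        # halve the learning rate
                self.cooldown_left = self.cooldown
                self.bad_epochs = 0
        return self.lr
```

With a flat val_loss, the learning rate would halve from 0.004 to 0.002 once the patience window elapses.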
 
 #### Speeds, Sizes, Times
 
+ The **FRACTAL-LidarHD_7cl_randlanet** model was trained on an in-house HPC cluster. 6 V100 GPUs were used (2 nodes, 3 GPUs per node). With this configuration, the approximate training time is 30 minutes per epoch.
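At ~30 minutes per epoch, reaching the selected checkpoint (num_epoch=21, reported below) corresponds to roughly ten hours of wall-clock training; a quick estimate:

```python
# Rough wall-clock estimate for reaching the selected checkpoint,
# assuming a constant ~30 minutes per epoch as reported above.
minutes_per_epoch = 30
checkpoint_epoch = 21

total_hours = checkpoint_epoch * minutes_per_epoch / 60
print(total_hours)  # 10.5
```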
 
 
 
 The model was obtained for num_epoch=21 with corresponding val_loss=0.112.
 
 <div style="position: relative; text-align: center;">
 <img src="FRACTAL-LidarHD_7cl_randlanet-train_val_losses.excalidraw.png" alt="train and val losses" style="width: 60%; display: block; margin: 0 auto;"/>
 </div>