CharlesGaydon committed
Commit
381ffb3
1 Parent(s): e2923a3

Update README.md

Files changed (1)
  1. README.md +31 -4
README.md CHANGED
@@ -69,8 +69,7 @@ The Lidar HD is an ambitious initiative that aims to obtain a 3D description of t
 While the model could be applied to other types of point clouds, [Lidar HD](https://geoservices.ign.fr/lidarhd) data have specific geometric specifications. Furthermore, the training data was colorized
 with very-high-definition aerial images from the [BD ORTHO®](https://geoservices.ign.fr/bdortho), which have their own spatial and radiometric specifications. Consequently, the model's predictions should be better for aerial lidar point clouds with densities and colorimetries similar to the original ones.
 
-**_Data preprocessing_**: Point clouds were preprocessed for training with point subsampling, filtering of artefact points, on-the-fly creation of colorimetric features, and normalization of features and coordinates.
-For inference, the same preprocessing should be used (refer to the inference configuration and to the code repository).
+**_Data preprocessing_**: ?? keep ?
 
 **_Multi-domain model_**: The FRACTAL dataset used for training covers 5 spatial domains from 5 southern regions of metropolitan France.
 The 250 km² of data in FRACTAL were sampled from an original 17440 km² area and cover a wide diversity of landscapes and scenes.
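The coordinate normalization that this commit describes (mean-xy subtraction, min-z subtraction, division by 25 m, per the hyperparameters listed further down) can be sketched as follows. This is a minimal NumPy illustration, not the actual Myria3D implementation; the function name is hypothetical.

```python
import numpy as np

def normalize_coordinates(xyz: np.ndarray, scale: float = 25.0) -> np.ndarray:
    """Center xy on the patch mean, anchor z on the patch minimum,
    then divide by a fixed scale (25 m in the model card)."""
    out = xyz.astype(np.float64).copy()
    out[:, :2] -= out[:, :2].mean(axis=0)  # horizontal normalization: mean xy subtraction
    out[:, 2] -= out[:, 2].min()           # vertical normalization: min z subtraction
    return out / scale                     # coordinates normalization: division by 25 m

# Example on a tiny synthetic patch (2 points, columns x, y, z)
pts = np.array([[10.0, 20.0, 5.0],
                [30.0, 40.0, 7.0]])
norm = normalize_coordinates(pts)
```

Per-patch statistics (mean xy, min z) make the sketch independent of the patch's absolute position, which matches the intent of normalizing 50 x 50 m training patches.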
 
@@ -95,17 +94,45 @@ For convenience and scalable model deployment, Myria3D comes with a Dockerfile.
 
 ## Training Details
 
+The data comes from the Lidar HD program, more specifically from acquisition areas that underwent automated classification followed by manual correction (so-called "optimized Lidar HD").
+It meets the quality requirements of the Lidar HD program, which accepts a controlled level of classification errors for each semantic class.
+
 ### Training Data
 
+80,000 point cloud patches of 50 x 50 meters were used to train the **FRACTAL-LidarHD_7cl_randlanet** model.
+10,000 additional patches were used for model validation.
 
 ### Training Procedure
 
 #### Preprocessing
 
-
+Point clouds were preprocessed for training with point subsampling, filtering of artefact points, on-the-fly creation of colorimetric features, and normalization of features and coordinates.
+For inference, a preprocessing as close as possible to the training one should be used. Refer to the inference configuration file and to the Myria3D code repository (V3.8).
 
 #### Training Hyperparameters
-
+* Model architecture: RandLa-Net (implemented with the Pytorch-Geometric framework in [Myria3D](https://github.com/IGNF/myria3d/blob/main/myria3d/models/modules/pyg_randla_net.py))
+* Augmentations:
+  * VerticalFlip(p=0.5)
+  * HorizontalFlip(p=0.5)
+* Features:
+  * Lidar: x, y, z, echo number (1-based numbering), number of echoes, reflectance (a.k.a. intensity)
+  * Colors:
+    * Original: RGB + Near-Infrared (colorization from aerial images by vertical pixel-point alignment)
+    * Derived: average color = (R+G+B)/3 and NDVI
+* Input preprocessing:
+  * grid sampling: 0.25 m
+  * random sampling: down to 40,000 points (if the patch has more)
+  * horizontal normalization: mean xy subtraction
+  * vertical normalization: min z subtraction
+  * coordinates normalization: division by 25 meters
+  * occlusion: nullify color channels if echo_number > 1
+* Batch size: 10
+* Number of epochs: 100 (min) to 150 (max)
+* Early stopping: patience of 6 epochs, monitoring val_loss
+* Optimizer: Adam
+* Scheduler: mode = "min", factor = 0.5, patience = 20, cooldown = 5
+* Learning rate: 0.004
 
 #### Speeds, Sizes, Times
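The derived colorimetric features listed in this commit (average color and NDVI) and the occlusion rule can be sketched as follows. This is an illustrative NumPy sketch, not Myria3D code: the NDVI formula (NIR - R) / (NIR + R) is the standard definition and an assumption here, since the card does not spell it out, and applying the occlusion nulling to the derived features (rather than only to the raw channels) is also an assumption.

```python
import numpy as np

def derive_color_features(rgb: np.ndarray, nir: np.ndarray,
                          echo_number: np.ndarray) -> dict:
    """Sketch of the derived features from the model card.

    average color = (R + G + B) / 3, as stated in the card.
    NDVI = (NIR - R) / (NIR + R): standard formula, assumed here.
    Colors are nullified for points with echo_number > 1, since later
    echoes lie below the surface seen by the aerial ortho-image.
    """
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    avg_color = (r + g + b) / 3.0
    ndvi = (nir - r) / np.maximum(nir + r, 1e-8)  # guard against division by zero
    occluded = echo_number > 1                    # occlusion rule from the card
    avg_color = np.where(occluded, 0.0, avg_color)
    ndvi = np.where(occluded, 0.0, ndvi)
    return {"avg_color": avg_color, "ndvi": ndvi}

# Two synthetic points: a first echo and a second (occluded) echo
feats = derive_color_features(
    rgb=np.array([[90.0, 120.0, 60.0], [30.0, 60.0, 30.0]]),
    nir=np.array([210.0, 90.0]),
    echo_number=np.array([1, 2]),
)
```

Nullifying colors on later echoes is consistent with the colorization process: the vertical pixel-point alignment can only attribute a reliable color to the first surface hit by the laser.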