---
language:
- eng
license: cc0-1.0
tags:
- multilabel-image-classification
- multilabel
- generated_from_trainer
base_model: DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs
model-index:
- name: DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs
  results: []
---

DinoVdrone is a fine-tuned version of [DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs](https://huggingface.co/DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs). It achieves the following results on the test set:

- Loss: 0.4512
- RMSE: 0.1689
- MAE: 0.1261
- KL Divergence: 0.5558

---

# Model description

DinoVdrone is a model built on top of the DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs model for underwater multilabel image classification. The classification head is a combination of linear, ReLU, batch normalization, and dropout layers.

The source code for training the model can be found in this [Git repository](https://github.com/SeatizenDOI/DinoVdeau).

- **Developed by:** [lombardata](https://huggingface.co/lombardata), credits to [César Leblanc](https://huggingface.co/CesarLeblanc) and [Victor Illien](https://huggingface.co/groderg)

---

# Intended uses & limitations

You can use the raw model to classify diverse marine species, encompassing coral morphotype classes taken from the Global Coral Reef Monitoring Network (GCRMN), habitat classes, and seagrass species. An illustrative inference sketch is given at the end of this card.

---

# Training and evaluation data

Details on the estimated number of images for each class are given in the following table:

| Class                    | train | test | val | Total |
|:-------------------------|------:|-----:|----:|------:|
| Acropore_branched        |   575 |  108 |  99 |   782 |
| Acropore_digitised       |   558 |  122 | 107 |   787 |
| Acropore_tabular         |   325 |  119 | 108 |   552 |
| Algae                    |  2354 |  778 | 776 |  3908 |
| Atra/Leucospilota        |   417 |   79 |  58 |   554 |
| Dead_coral               |  1778 |  485 | 503 |  2766 |
| Fish                     |  1391 |  361 | 352 |  2104 |
| Millepore                |   591 |  196 | 178 |   965 |
| No_acropore_encrusting   |   460 |  212 | 211 |   883 |
| No_acropore_massive      |  1778 |  604 | 592 |  2974 |
| No_acropore_sub_massive  |  1563 |  439 | 443 |  2445 |
| Rock                     |  2381 |  793 | 781 |  3955 |
| Rubble                   |  2363 |  784 | 784 |  3931 |
| Sand                     |  2401 |  802 | 801 |  4004 |
| Sea_cucumber             |  1116 |  313 | 287 |  1716 |
| Sea_urchins              |   158 |   64 |  89 |   311 |

---

# Training procedure

## Training hyperparameters

The following hyperparameters were used during training (see the sketch after the augmentation lists below):

- **Number of Epochs**: 37.0
- **Learning Rate**: 0.001
- **Train Batch Size**: 32
- **Eval Batch Size**: 32
- **Optimizer**: Adam
- **LR Scheduler Type**: ReduceLROnPlateau with a patience of 5 epochs and a factor of 0.1
- **Freeze Encoder**: Yes
- **Data Augmentation**: Yes

## Data Augmentation

Data were augmented using the following transformations (a code sketch follows the lists):

Train Transforms
- **PreProcess**: No additional parameters
- **Resize**: probability=1.00
- **RandomHorizontalFlip**: probability=0.25
- **RandomVerticalFlip**: probability=0.25
- **ColorJiggle**: probability=0.25
- **RandomPerspective**: probability=0.25
- **Normalize**: probability=1.00

Val Transforms
- **PreProcess**: No additional parameters
- **Resize**: probability=1.00
- **Normalize**: probability=1.00
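The transform names above (in particular `ColorJiggle`) suggest a Kornia-based pipeline. The sketch below reproduces the train and validation transforms with `kornia.augmentation` as a minimal illustration; the input resolution, jitter amounts, perspective distortion scale, and normalization statistics are assumptions rather than values taken from the DinoVdeau configuration, and the unspecified `PreProcess` step is omitted.

```python
import torch
import torch.nn as nn
import kornia.augmentation as K

# Assumed image size and ImageNet normalization statistics (not taken from the
# actual DinoVdeau configuration).
IMAGE_SIZE = (518, 518)
MEAN = torch.tensor([0.485, 0.456, 0.406])
STD = torch.tensor([0.229, 0.224, 0.225])

# Train-time pipeline mirroring the list above; each random transform fires
# with probability 0.25, while resize and normalization are always applied.
train_transforms = nn.Sequential(
    K.Resize(IMAGE_SIZE),
    K.RandomHorizontalFlip(p=0.25),
    K.RandomVerticalFlip(p=0.25),
    K.ColorJiggle(0.1, 0.1, 0.1, 0.1, p=0.25),  # brightness/contrast/saturation/hue (illustrative)
    K.RandomPerspective(distortion_scale=0.5, p=0.25),
    K.Normalize(mean=MEAN, std=STD),
)

# Validation keeps only the deterministic steps.
val_transforms = nn.Sequential(
    K.Resize(IMAGE_SIZE),
    K.Normalize(mean=MEAN, std=STD),
)

# Kornia augmentations operate batch-wise on float tensors of shape (B, C, H, W).
images = torch.rand(4, 3, 600, 600)
augmented = train_transforms(images)  # shape (4, 3, 518, 518)
```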
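The frozen-encoder setup described in the hyperparameters can be written roughly as follows. This is a minimal sketch that assumes a `facebook/dinov2-large` backbone from `transformers` and invents the head width, dropout rate, and forward logic; it is not the actual DinoVdeau implementation.

```python
import torch
import torch.nn as nn
from transformers import Dinov2Model

NUM_CLASSES = 16  # number of classes listed in the table above

# Assumed backbone; the real checkpoint is derived from the DinoVdrone-large
# base model rather than this exact identifier.
encoder = Dinov2Model.from_pretrained("facebook/dinov2-large")

# Freeze the encoder, as stated in the hyperparameters.
for param in encoder.parameters():
    param.requires_grad = False

# Classification head combining linear, ReLU, batch normalization and dropout
# layers; the hidden width (512) and dropout rate (0.2) are illustrative.
head = nn.Sequential(
    nn.Linear(encoder.config.hidden_size, 512),
    nn.ReLU(),
    nn.BatchNorm1d(512),
    nn.Dropout(0.2),
    nn.Linear(512, NUM_CLASSES),
)

# Adam at the stated learning rate, with ReduceLROnPlateau (patience 5, factor 0.1).
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=5
)

def predict_probs(pixel_values: torch.Tensor) -> torch.Tensor:
    """Multilabel probabilities from a batch of preprocessed images."""
    with torch.no_grad():
        features = encoder(pixel_values=pixel_values).pooler_output
    return torch.sigmoid(head(features))

# After each validation epoch, the scheduler is driven by the validation loss:
# scheduler.step(val_loss)
```

The learning-rate column in the results table below (0.001, then 0.0001, then 1e-05) is consistent with such a ReduceLROnPlateau schedule.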
## Training results

Epoch | Validation Loss | MAE | RMSE | KL div | Learning Rate
--- | --- | --- | --- | --- | ---
1 | 0.5543646216392517 | 0.2278 | 0.2605 | 1.7912 | 0.001
2 | 0.48171326518058777 | 0.1602 | 0.2007 | 1.0345 | 0.001
3 | 0.4615386724472046 | 0.1370 | 0.1801 | 0.6064 | 0.001
4 | 0.4632064700126648 | 0.1391 | 0.1837 | 0.6577 | 0.001
5 | 0.4579373300075531 | 0.1363 | 0.1769 | 0.6678 | 0.001
6 | 0.4571229815483093 | 0.1330 | 0.1766 | 0.7896 | 0.001
7 | 0.4586440622806549 | 0.1307 | 0.1773 | 0.6493 | 0.001
8 | 0.45786425471305847 | 0.1319 | 0.1772 | 0.9475 | 0.001
9 | 0.4551210105419159 | 0.1306 | 0.1746 | 0.7271 | 0.001
10 | 0.45817646384239197 | 0.1316 | 0.1774 | 0.6882 | 0.001
11 | 0.46834859251976013 | 0.1372 | 0.1842 | 0.3715 | 0.001
12 | 0.4578668475151062 | 0.1316 | 0.1764 | 0.5271 | 0.001
13 | 0.4558842182159424 | 0.1301 | 0.1756 | 0.9168 | 0.001
14 | 0.4555540680885315 | 0.1292 | 0.1749 | 0.8827 | 0.001
15 | 0.45217740535736084 | 0.1262 | 0.1717 | 0.7009 | 0.001
16 | 0.45556434988975525 | 0.1286 | 0.1753 | 1.0038 | 0.001
17 | 0.458648681640625 | 0.1343 | 0.1775 | 0.2600 | 0.001
18 | 0.567169725894928 | 0.1638 | 0.2369 | 2.0548 | 0.001
19 | 0.45287612080574036 | 0.1287 | 0.1727 | 0.7115 | 0.001
20 | 0.45518893003463745 | 0.1285 | 0.1746 | 0.9694 | 0.001
21 | 0.45299893617630005 | 0.1282 | 0.1724 | 0.7789 | 0.001
22 | 0.4502638280391693 | 0.1261 | 0.1700 | 0.7369 | 0.0001
23 | 0.453466534614563 | 0.1280 | 0.1716 | 0.5027 | 0.0001
24 | 0.4502425491809845 | 0.1264 | 0.1697 | 0.5968 | 0.0001
25 | 0.45040303468704224 | 0.1267 | 0.1699 | 0.6215 | 0.0001
26 | 0.4509589374065399 | 0.1260 | 0.1704 | 0.6568 | 0.0001
27 | 0.4497845768928528 | 0.1262 | 0.1693 | 0.5748 | 0.0001
28 | 0.45060041546821594 | 0.1256 | 0.1701 | 0.7001 | 0.0001
29 | 0.4504892826080322 | 0.1263 | 0.1699 | 0.5840 | 0.0001
30 | 0.45060065388679504 | 0.1252 | 0.1703 | 0.8101 | 0.0001
31 | 0.45080825686454773 | 0.1249 | 0.1701 | 0.7416 | 0.0001
32 | 0.4501984417438507 | 0.1254 | 0.1697 | 0.6402 | 0.0001
33 | 0.4510658085346222 | 0.1250 | 0.1710 | 0.8411 | 0.0001
34 | 0.45148056745529175 | 0.1259 | 0.1711 | 0.7204 | 1e-05
35 | 0.4502483904361725 | 0.1247 | 0.1698 | 0.7355 | 1e-05
36 | 0.4508889615535736 | 0.1261 | 0.1703 | 0.4990 | 1e-05
37 | 0.44998663663864136 | 0.1260 | 0.1696 | 0.5451 | 1e-05

---

# Framework Versions

- **Transformers**: 4.48.0
- **Pytorch**: 2.5.1+cu124
- **Datasets**: 3.0.2
- **Tokenizers**: 0.21.0
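For completeness, here is a minimal inference sketch for the use case described under "Intended uses & limitations". It assumes the checkpoint can be loaded through the standard `transformers` Auto classes (if the repository ships custom modeling code, `trust_remote_code=True` would also be needed); the repository id, image path, and decision threshold are illustrative assumptions.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

# Illustrative repository id; replace with the actual Hugging Face path of this model.
model_id = "lombardata/DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs"

processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModelForImageClassification.from_pretrained(model_id)
model.eval()

# Any underwater image; the file name is a placeholder.
image = Image.open("reef_photo.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Multilabel head: apply a sigmoid and threshold each class independently.
probabilities = torch.sigmoid(logits)[0]
predicted_labels = [
    model.config.id2label[i]
    for i, p in enumerate(probabilities.tolist())
    if p > 0.5  # illustrative threshold
]
print(predicted_labels)
```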