---
language:
- eng
license: cc0-1.0
tags:
- multilabel-image-classification
- multilabel
- generated_from_trainer
base_model: DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs
model-index:
- name: DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs
  results: []
---

DinoVdrone is a fine-tuned version of [DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs](https://huggingface.co/DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs). It achieves the following results on the test set:

- Loss: 0.4512
- RMSE: 0.1689
- MAE: 0.1261
- KL Divergence: 0.5558

---

# Model description

DinoVdrone is a model built on top of the DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs model for underwater multilabel image classification. The classification head is a combination of linear, ReLU, batch normalization, and dropout layers.

The source code for training the model can be found in this [Git repository](https://github.com/SeatizenDOI/DinoVdeau).

- **Developed by:** [lombardata](https://huggingface.co/lombardata), credits to [César Leblanc](https://huggingface.co/CesarLeblanc) and [Victor Illien](https://huggingface.co/groderg)

---

# Intended uses & limitations

You can use the raw model to classify diverse marine species, encompassing coral morphotype classes taken from the Global Coral Reef Monitoring Network (GCRMN), habitat classes, and seagrass species. An illustrative inference sketch is given at the end of this card.

---

# Training and evaluation data

Details on the estimated number of images for each class are given in the following table:

| Class                    | train | test | val | Total |
|:-------------------------|------:|-----:|----:|------:|
| Acropore_branched        |   575 |  108 |  99 |   782 |
| Acropore_digitised       |   558 |  122 | 107 |   787 |
| Acropore_tabular         |   325 |  119 | 108 |   552 |
| Algae                    |  2354 |  778 | 776 |  3908 |
| Atra/Leucospilota        |   417 |   79 |  58 |   554 |
| Dead_coral               |  1778 |  485 | 503 |  2766 |
| Fish                     |  1391 |  361 | 352 |  2104 |
| Millepore                |   591 |  196 | 178 |   965 |
| No_acropore_encrusting   |   460 |  212 | 211 |   883 |
| No_acropore_massive      |  1778 |  604 | 592 |  2974 |
| No_acropore_sub_massive  |  1563 |  439 | 443 |  2445 |
| Rock                     |  2381 |  793 | 781 |  3955 |
| Rubble                   |  2363 |  784 | 784 |  3931 |
| Sand                     |  2401 |  802 | 801 |  4004 |
| Sea_cucumber             |  1116 |  313 | 287 |  1716 |
| Sea_urchins              |   158 |   64 |  89 |   311 |

---

# Training procedure

## Training hyperparameters

The following hyperparameters were used during training (see the sketch after the augmentation lists below):

- **Number of Epochs**: 37.0
- **Learning Rate**: 0.001
- **Train Batch Size**: 32
- **Eval Batch Size**: 32
- **Optimizer**: Adam
- **LR Scheduler Type**: ReduceLROnPlateau with a patience of 5 epochs and a factor of 0.1
- **Freeze Encoder**: Yes
- **Data Augmentation**: Yes

## Data Augmentation

Data were augmented using the following transformations (a code sketch follows the lists):

Train Transforms
- **PreProcess**: No additional parameters
- **Resize**: probability=1.00
- **RandomHorizontalFlip**: probability=0.25
- **RandomVerticalFlip**: probability=0.25
- **ColorJiggle**: probability=0.25
- **RandomPerspective**: probability=0.25
- **Normalize**: probability=1.00

Val Transforms
- **PreProcess**: No additional parameters
- **Resize**: probability=1.00
- **Normalize**: probability=1.00
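The transform names above (in particular `ColorJiggle`) suggest a Kornia-based pipeline. The sketch below reproduces the train and validation transforms with `kornia.augmentation` as a minimal illustration; the input resolution, jitter amounts, perspective distortion scale, and normalization statistics are assumptions rather than values taken from the DinoVdeau configuration, and the unspecified `PreProcess` step is omitted.

```python
import torch
import torch.nn as nn
import kornia.augmentation as K

# Assumed image size and ImageNet normalization statistics (not taken from the
# actual DinoVdeau configuration).
IMAGE_SIZE = (518, 518)
MEAN = torch.tensor([0.485, 0.456, 0.406])
STD = torch.tensor([0.229, 0.224, 0.225])

# Train-time pipeline mirroring the list above; each random transform fires
# with probability 0.25, while resize and normalization are always applied.
train_transforms = nn.Sequential(
    K.Resize(IMAGE_SIZE),
    K.RandomHorizontalFlip(p=0.25),
    K.RandomVerticalFlip(p=0.25),
    K.ColorJiggle(0.1, 0.1, 0.1, 0.1, p=0.25),  # brightness/contrast/saturation/hue (illustrative)
    K.RandomPerspective(distortion_scale=0.5, p=0.25),
    K.Normalize(mean=MEAN, std=STD),
)

# Validation keeps only the deterministic steps.
val_transforms = nn.Sequential(
    K.Resize(IMAGE_SIZE),
    K.Normalize(mean=MEAN, std=STD),
)

# Kornia augmentations operate batch-wise on float tensors of shape (B, C, H, W).
images = torch.rand(4, 3, 600, 600)
augmented = train_transforms(images)  # shape (4, 3, 518, 518)
```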
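The frozen-encoder setup described in the hyperparameters can be written roughly as follows. This is a minimal sketch that assumes a `facebook/dinov2-large` backbone from `transformers` and invents the head width, dropout rate, and forward logic; it is not the actual DinoVdeau implementation.

```python
import torch
import torch.nn as nn
from transformers import Dinov2Model

NUM_CLASSES = 16  # number of classes listed in the table above

# Assumed backbone; the real checkpoint is derived from the DinoVdrone-large
# base model rather than this exact identifier.
encoder = Dinov2Model.from_pretrained("facebook/dinov2-large")

# Freeze the encoder, as stated in the hyperparameters.
for param in encoder.parameters():
    param.requires_grad = False

# Classification head combining linear, ReLU, batch normalization and dropout
# layers; the hidden width (512) and dropout rate (0.2) are illustrative.
head = nn.Sequential(
    nn.Linear(encoder.config.hidden_size, 512),
    nn.ReLU(),
    nn.BatchNorm1d(512),
    nn.Dropout(0.2),
    nn.Linear(512, NUM_CLASSES),
)

# Adam at the stated learning rate, with ReduceLROnPlateau (patience 5, factor 0.1).
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=5
)

def predict_probs(pixel_values: torch.Tensor) -> torch.Tensor:
    """Multilabel probabilities from a batch of preprocessed images."""
    with torch.no_grad():
        features = encoder(pixel_values=pixel_values).pooler_output
    return torch.sigmoid(head(features))

# After each validation epoch, the scheduler is driven by the validation loss:
# scheduler.step(val_loss)
```

The learning-rate column in the results table below (0.001, then 0.0001, then 1e-05) is consistent with such a ReduceLROnPlateau schedule.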
## Training results

Epoch | Validation Loss | MAE | RMSE | KL div | Learning Rate
--- | --- | --- | --- | --- | ---
1 | 0.5543646216392517 | 0.2278 | 0.2605 | 1.7912 | 0.001
2 | 0.48171326518058777 | 0.1602 | 0.2007 | 1.0345 | 0.001
3 | 0.4615386724472046 | 0.1370 | 0.1801 | 0.6064 | 0.001
4 | 0.4632064700126648 | 0.1391 | 0.1837 | 0.6577 | 0.001
5 | 0.4579373300075531 | 0.1363 | 0.1769 | 0.6678 | 0.001
6 | 0.4571229815483093 | 0.1330 | 0.1766 | 0.7896 | 0.001
7 | 0.4586440622806549 | 0.1307 | 0.1773 | 0.6493 | 0.001
8 | 0.45786425471305847 | 0.1319 | 0.1772 | 0.9475 | 0.001
9 | 0.4551210105419159 | 0.1306 | 0.1746 | 0.7271 | 0.001
10 | 0.45817646384239197 | 0.1316 | 0.1774 | 0.6882 | 0.001
11 | 0.46834859251976013 | 0.1372 | 0.1842 | 0.3715 | 0.001
12 | 0.4578668475151062 | 0.1316 | 0.1764 | 0.5271 | 0.001
13 | 0.4558842182159424 | 0.1301 | 0.1756 | 0.9168 | 0.001
14 | 0.4555540680885315 | 0.1292 | 0.1749 | 0.8827 | 0.001
15 | 0.45217740535736084 | 0.1262 | 0.1717 | 0.7009 | 0.001
16 | 0.45556434988975525 | 0.1286 | 0.1753 | 1.0038 | 0.001
17 | 0.458648681640625 | 0.1343 | 0.1775 | 0.2600 | 0.001
18 | 0.567169725894928 | 0.1638 | 0.2369 | 2.0548 | 0.001
19 | 0.45287612080574036 | 0.1287 | 0.1727 | 0.7115 | 0.001
20 | 0.45518893003463745 | 0.1285 | 0.1746 | 0.9694 | 0.001
21 | 0.45299893617630005 | 0.1282 | 0.1724 | 0.7789 | 0.001
22 | 0.4502638280391693 | 0.1261 | 0.1700 | 0.7369 | 0.0001
23 | 0.453466534614563 | 0.1280 | 0.1716 | 0.5027 | 0.0001
24 | 0.4502425491809845 | 0.1264 | 0.1697 | 0.5968 | 0.0001
25 | 0.45040303468704224 | 0.1267 | 0.1699 | 0.6215 | 0.0001
26 | 0.4509589374065399 | 0.1260 | 0.1704 | 0.6568 | 0.0001
27 | 0.4497845768928528 | 0.1262 | 0.1693 | 0.5748 | 0.0001
28 | 0.45060041546821594 | 0.1256 | 0.1701 | 0.7001 | 0.0001
29 | 0.4504892826080322 | 0.1263 | 0.1699 | 0.5840 | 0.0001
30 | 0.45060065388679504 | 0.1252 | 0.1703 | 0.8101 | 0.0001
31 | 0.45080825686454773 | 0.1249 | 0.1701 | 0.7416 | 0.0001
32 | 0.4501984417438507 | 0.1254 | 0.1697 | 0.6402 | 0.0001
33 | 0.4510658085346222 | 0.1250 | 0.1710 | 0.8411 | 0.0001
34 | 0.45148056745529175 | 0.1259 | 0.1711 | 0.7204 | 1e-05
35 | 0.4502483904361725 | 0.1247 | 0.1698 | 0.7355 | 1e-05
36 | 0.4508889615535736 | 0.1261 | 0.1703 | 0.4990 | 1e-05
37 | 0.44998663663864136 | 0.1260 | 0.1696 | 0.5451 | 1e-05

---

# Framework Versions

- **Transformers**: 4.48.0
- **Pytorch**: 2.5.1+cu124
- **Datasets**: 3.0.2
- **Tokenizers**: 0.21.0
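For completeness, here is a minimal inference sketch for the use case described under "Intended uses & limitations". It assumes the checkpoint can be loaded through the standard `transformers` Auto classes (if the repository ships custom modeling code, `trust_remote_code=True` would also be needed); the repository id, image path, and decision threshold are illustrative assumptions.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

# Illustrative repository id; replace with the actual Hugging Face path of this model.
model_id = "lombardata/DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs"

processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModelForImageClassification.from_pretrained(model_id)
model.eval()

# Any underwater image; the file name is a placeholder.
image = Image.open("reef_photo.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Multilabel head: apply a sigmoid and threshold each class independently.
probabilities = torch.sigmoid(logits)[0]
predicted_labels = [
    model.config.id2label[i]
    for i, p in enumerate(probabilities.tolist())
    if p > 0.5  # illustrative threshold
]
print(predicted_labels)
```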