---
language:
- en
license: cc0-1.0
tags:
- multilabel-image-classification
- multilabel
- generated_from_trainer
base_model: DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs
model-index:
- name: DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs
results: []
---
DinoVdrone is a fine-tuned version of [DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs](https://huggingface.co/DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs). It achieves the following results on the test set:
- Loss: 0.4512
- RMSE: 0.1689
- MAE: 0.1261
- KL Divergence: 0.5558
---
# Model description
DinoVdrone is a model built on top of the DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs model for underwater multilabel image classification. The classification head is a combination of linear, ReLU, batch normalization, and dropout layers.
The source code for training the model can be found in this [Git repository](https://github.com/SeatizenDOI/DinoVdeau).
- **Developed by:** [lombardata](https://huggingface.co/lombardata), credits to [César Leblanc](https://huggingface.co/CesarLeblanc) and [Victor Illien](https://huggingface.co/groderg)
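The head described above can be sketched in PyTorch as follows. This is a minimal illustration, not the exact architecture: the hidden width of 512, the dropout rate, the 1024-dimensional encoder output, and the layer ordering are all assumptions; only the layer types (linear, ReLU, batch normalization, dropout) come from the description above.

```python
import torch
import torch.nn as nn

# Assumed dimensions: 1024 matches a DINOv2-large embedding, and 16 is the
# number of classes in the table below; both are illustrative choices.
NUM_FEATURES = 1024
NUM_CLASSES = 16

# A sketch of a classification head combining linear, ReLU,
# batch-normalization, and dropout layers.
head = nn.Sequential(
    nn.Linear(NUM_FEATURES, 512),
    nn.BatchNorm1d(512),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(512, NUM_CLASSES),
)

features = torch.randn(8, NUM_FEATURES)  # a batch of 8 pooled embeddings
logits = head(features)
probs = torch.sigmoid(logits)  # independent per-class probabilities
print(probs.shape)  # torch.Size([8, 16])
```

A sigmoid (rather than softmax) is shown because the task is multilabel: each class probability is predicted independently.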
---
# Intended uses & limitations
You can use the raw model to classify diverse marine species, encompassing coral morphotype classes taken from the Global Coral Reef Monitoring Network (GCRMN), habitat classes, and seagrass species.
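For multilabel classification, predictions are typically obtained by applying a sigmoid to the model's logits and thresholding each class independently. The snippet below sketches this post-processing step; the logits, the class subset, and the 0.5 threshold are stand-ins for illustration, not values from the model itself.

```python
import torch

# Subset of the classes listed in the table below, for illustration only.
CLASS_NAMES = ["Acropore_branched", "Algae", "Rock"]

# Hypothetical logits standing in for the model's output on one image.
logits = torch.tensor([2.0, -1.0, 0.5])

# Sigmoid + per-class threshold: each label is decided independently.
probs = torch.sigmoid(logits)
predicted = [name for name, p in zip(CLASS_NAMES, probs) if p > 0.5]
print(predicted)  # ['Acropore_branched', 'Rock']
```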
---
# Training and evaluation data
Details on the estimated number of images for each class are given in the following table:
| Class | train | test | val | Total |
|:------------------------|--------:|-------:|------:|--------:|
| Acropore_branched | 575 | 108 | 99 | 782 |
| Acropore_digitised | 558 | 122 | 107 | 787 |
| Acropore_tabular | 325 | 119 | 108 | 552 |
| Algae | 2354 | 778 | 776 | 3908 |
| Atra/Leucospilota | 417 | 79 | 58 | 554 |
| Dead_coral | 1778 | 485 | 503 | 2766 |
| Fish | 1391 | 361 | 352 | 2104 |
| Millepore | 591 | 196 | 178 | 965 |
| No_acropore_encrusting | 460 | 212 | 211 | 883 |
| No_acropore_massive | 1778 | 604 | 592 | 2974 |
| No_acropore_sub_massive | 1563 | 439 | 443 | 2445 |
| Rock | 2381 | 793 | 781 | 3955 |
| Rubble | 2363 | 784 | 784 | 3931 |
| Sand | 2401 | 802 | 801 | 4004 |
| Sea_cucumber | 1116 | 313 | 287 | 1716 |
| Sea_urchins | 158 | 64 | 89 | 311 |
---
# Training procedure
## Training hyperparameters
The following hyperparameters were used during training:
- **Number of Epochs**: 37
- **Learning Rate**: 0.001
- **Train Batch Size**: 32
- **Eval Batch Size**: 32
- **Optimizer**: Adam
- **LR Scheduler Type**: ReduceLROnPlateau with a patience of 5 epochs and a factor of 0.1
- **Freeze Encoder**: Yes
- **Data Augmentation**: Yes
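The optimizer and scheduler settings above can be reproduced as follows. The tiny stand-in model is an assumption for illustration; the Adam learning rate and the ReduceLROnPlateau patience and factor are taken directly from the list above.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 16)  # stand-in for the actual model

# Settings from the hyperparameter list: Adam with lr=0.001, and a
# ReduceLROnPlateau scheduler with patience=5 and factor=0.1.
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=5
)

# Once the validation loss stops improving for more than 5 epochs, the
# learning rate is divided by 10, which matches the 0.001 -> 0.0001 -> 1e-05
# progression visible in the training results table below.
for val_loss in [0.50, 0.49] + [0.48] * 8:
    scheduler.step(val_loss)
print(optimizer.param_groups[0]["lr"])
```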
## Data Augmentation
Data were augmented using the following transformations:

**Train Transforms**
- **PreProcess**: No additional parameters
- **Resize**: probability=1.00
- **RandomHorizontalFlip**: probability=0.25
- **RandomVerticalFlip**: probability=0.25
- **ColorJiggle**: probability=0.25
- **RandomPerspective**: probability=0.25
- **Normalize**: probability=1.00
**Val Transforms**
- **PreProcess**: No additional parameters
- **Resize**: probability=1.00
- **Normalize**: probability=1.00
## Training results
| Epoch | Validation Loss | MAE | RMSE | KL div | Learning Rate |
|------:|----------------:|-------:|-------:|-------:|--------------:|
| 1 | 0.5544 | 0.2278 | 0.2605 | 1.7912 | 0.001 |
| 2 | 0.4817 | 0.1602 | 0.2007 | 1.0345 | 0.001 |
| 3 | 0.4615 | 0.1370 | 0.1801 | 0.6064 | 0.001 |
| 4 | 0.4632 | 0.1391 | 0.1837 | 0.6577 | 0.001 |
| 5 | 0.4579 | 0.1363 | 0.1769 | 0.6678 | 0.001 |
| 6 | 0.4571 | 0.1330 | 0.1766 | 0.7896 | 0.001 |
| 7 | 0.4586 | 0.1307 | 0.1773 | 0.6493 | 0.001 |
| 8 | 0.4579 | 0.1319 | 0.1772 | 0.9475 | 0.001 |
| 9 | 0.4551 | 0.1306 | 0.1746 | 0.7271 | 0.001 |
| 10 | 0.4582 | 0.1316 | 0.1774 | 0.6882 | 0.001 |
| 11 | 0.4683 | 0.1372 | 0.1842 | 0.3715 | 0.001 |
| 12 | 0.4579 | 0.1316 | 0.1764 | 0.5271 | 0.001 |
| 13 | 0.4559 | 0.1301 | 0.1756 | 0.9168 | 0.001 |
| 14 | 0.4556 | 0.1292 | 0.1749 | 0.8827 | 0.001 |
| 15 | 0.4522 | 0.1262 | 0.1717 | 0.7009 | 0.001 |
| 16 | 0.4556 | 0.1286 | 0.1753 | 1.0038 | 0.001 |
| 17 | 0.4586 | 0.1343 | 0.1775 | 0.2600 | 0.001 |
| 18 | 0.5672 | 0.1638 | 0.2369 | 2.0548 | 0.001 |
| 19 | 0.4529 | 0.1287 | 0.1727 | 0.7115 | 0.001 |
| 20 | 0.4552 | 0.1285 | 0.1746 | 0.9694 | 0.001 |
| 21 | 0.4530 | 0.1282 | 0.1724 | 0.7789 | 0.001 |
| 22 | 0.4503 | 0.1261 | 0.1700 | 0.7369 | 0.0001 |
| 23 | 0.4535 | 0.1280 | 0.1716 | 0.5027 | 0.0001 |
| 24 | 0.4502 | 0.1264 | 0.1697 | 0.5968 | 0.0001 |
| 25 | 0.4504 | 0.1267 | 0.1699 | 0.6215 | 0.0001 |
| 26 | 0.4510 | 0.1260 | 0.1704 | 0.6568 | 0.0001 |
| 27 | 0.4498 | 0.1262 | 0.1693 | 0.5748 | 0.0001 |
| 28 | 0.4506 | 0.1256 | 0.1701 | 0.7001 | 0.0001 |
| 29 | 0.4505 | 0.1263 | 0.1699 | 0.5840 | 0.0001 |
| 30 | 0.4506 | 0.1252 | 0.1703 | 0.8101 | 0.0001 |
| 31 | 0.4508 | 0.1249 | 0.1701 | 0.7416 | 0.0001 |
| 32 | 0.4502 | 0.1254 | 0.1697 | 0.6402 | 0.0001 |
| 33 | 0.4511 | 0.1250 | 0.1710 | 0.8411 | 0.0001 |
| 34 | 0.4515 | 0.1259 | 0.1711 | 0.7204 | 1e-05 |
| 35 | 0.4502 | 0.1247 | 0.1698 | 0.7355 | 1e-05 |
| 36 | 0.4509 | 0.1261 | 0.1703 | 0.4990 | 1e-05 |
| 37 | 0.4500 | 0.1260 | 0.1696 | 0.5451 | 1e-05 |
---
# Framework Versions
- **Transformers**: 4.48.0
- **Pytorch**: 2.5.1+cu124
- **Datasets**: 3.0.2
- **Tokenizers**: 0.21.0