---
language:
- en
license: cc0-1.0
tags:
- multilabel-image-classification
- multilabel
- generated_from_trainer
base_model: DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs
model-index:
- name: DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs
results: []
---
DinoVdrone is a fine-tuned version of [DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs](https://huggingface.co/DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs). It achieves the following results on the test set:
- Loss: 0.4512
- RMSE: 0.1689
- MAE: 0.1261
- KL Divergence: 0.5558
---
# Model description
DinoVdrone is a model built on top of the DinoVdrone-large-2025_02_03_31850-bs32_freeze_probs model for underwater multilabel image classification. The classification head is a combination of linear, ReLU, batch normalization, and dropout layers.
The source code for training the model can be found in this [Git repository](https://github.com/SeatizenDOI/DinoVdeau).
- **Developed by:** [lombardata](https://huggingface.co/lombardata), credits to [César Leblanc](https://huggingface.co/CesarLeblanc) and [Victor Illien](https://huggingface.co/groderg)
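The head described above can be sketched in PyTorch as follows. This is a minimal illustration, not the exact architecture: the hidden width of 512, the dropout rate, the 1024-dimensional encoder output, and the layer ordering are all assumptions; only the layer types (linear, ReLU, batch normalization, dropout) come from the description above.

```python
import torch
import torch.nn as nn

# Assumed dimensions: 1024 matches a DINOv2-large embedding, and 16 is the
# number of classes in the table below; both are illustrative choices.
NUM_FEATURES = 1024
NUM_CLASSES = 16

# A sketch of a classification head combining linear, ReLU,
# batch-normalization, and dropout layers.
head = nn.Sequential(
    nn.Linear(NUM_FEATURES, 512),
    nn.BatchNorm1d(512),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(512, NUM_CLASSES),
)

features = torch.randn(8, NUM_FEATURES)  # a batch of 8 pooled embeddings
logits = head(features)
probs = torch.sigmoid(logits)  # independent per-class probabilities
print(probs.shape)  # torch.Size([8, 16])
```

A sigmoid (rather than softmax) is shown because the task is multilabel: each class probability is predicted independently.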
---
# Intended uses & limitations
You can use the raw model to classify diverse marine species, encompassing coral morphotype classes taken from the Global Coral Reef Monitoring Network (GCRMN), habitat classes, and seagrass species.
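For multilabel classification, predictions are typically obtained by applying a sigmoid to the model's logits and thresholding each class independently. The snippet below sketches this post-processing step; the logits, the class subset, and the 0.5 threshold are stand-ins for illustration, not values from the model itself.

```python
import torch

# Subset of the classes listed in the table below, for illustration only.
CLASS_NAMES = ["Acropore_branched", "Algae", "Rock"]

# Hypothetical logits standing in for the model's output on one image.
logits = torch.tensor([2.0, -1.0, 0.5])

# Sigmoid + per-class threshold: each label is decided independently.
probs = torch.sigmoid(logits)
predicted = [name for name, p in zip(CLASS_NAMES, probs) if p > 0.5]
print(predicted)  # ['Acropore_branched', 'Rock']
```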
---
# Training and evaluation data
Details on the estimated number of images for each class are given in the following table:
| Class | train | test | val | Total |
|:------------------------|--------:|-------:|------:|--------:|
| Acropore_branched | 575 | 108 | 99 | 782 |
| Acropore_digitised | 558 | 122 | 107 | 787 |
| Acropore_tabular | 325 | 119 | 108 | 552 |
| Algae | 2354 | 778 | 776 | 3908 |
| Atra/Leucospilota | 417 | 79 | 58 | 554 |
| Dead_coral | 1778 | 485 | 503 | 2766 |
| Fish | 1391 | 361 | 352 | 2104 |
| Millepore | 591 | 196 | 178 | 965 |
| No_acropore_encrusting | 460 | 212 | 211 | 883 |
| No_acropore_massive | 1778 | 604 | 592 | 2974 |
| No_acropore_sub_massive | 1563 | 439 | 443 | 2445 |
| Rock | 2381 | 793 | 781 | 3955 |
| Rubble | 2363 | 784 | 784 | 3931 |
| Sand | 2401 | 802 | 801 | 4004 |
| Sea_cucumber | 1116 | 313 | 287 | 1716 |
| Sea_urchins | 158 | 64 | 89 | 311 |
---
# Training procedure
## Training hyperparameters
The following hyperparameters were used during training:
- **Number of Epochs**: 37
- **Learning Rate**: 0.001
- **Train Batch Size**: 32
- **Eval Batch Size**: 32
- **Optimizer**: Adam
- **LR Scheduler Type**: ReduceLROnPlateau with a patience of 5 epochs and a factor of 0.1
- **Freeze Encoder**: Yes
- **Data Augmentation**: Yes
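The optimizer and scheduler settings above can be reproduced as follows. The tiny stand-in model is an assumption for illustration; the Adam learning rate and the ReduceLROnPlateau patience and factor are taken directly from the list above.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 16)  # stand-in for the actual model

# Settings from the hyperparameter list: Adam with lr=0.001, and a
# ReduceLROnPlateau scheduler with patience=5 and factor=0.1.
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=5
)

# Once the validation loss stops improving for more than 5 epochs, the
# learning rate is divided by 10, which matches the 0.001 -> 0.0001 -> 1e-05
# progression visible in the training results table below.
for val_loss in [0.50, 0.49] + [0.48] * 8:
    scheduler.step(val_loss)
print(optimizer.param_groups[0]["lr"])
```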
## Data Augmentation
Data were augmented using the following transformations:

**Train Transforms**
- **PreProcess**: No additional parameters
- **Resize**: probability=1.00
- **RandomHorizontalFlip**: probability=0.25
- **RandomVerticalFlip**: probability=0.25
- **ColorJiggle**: probability=0.25
- **RandomPerspective**: probability=0.25
- **Normalize**: probability=1.00
**Val Transforms**
- **PreProcess**: No additional parameters
- **Resize**: probability=1.00
- **Normalize**: probability=1.00
## Training results
| Epoch | Validation Loss | MAE | RMSE | KL div | Learning Rate |
|------:|----------------:|-------:|-------:|-------:|--------------:|
| 1 | 0.5544 | 0.2278 | 0.2605 | 1.7912 | 0.001 |
| 2 | 0.4817 | 0.1602 | 0.2007 | 1.0345 | 0.001 |
| 3 | 0.4615 | 0.1370 | 0.1801 | 0.6064 | 0.001 |
| 4 | 0.4632 | 0.1391 | 0.1837 | 0.6577 | 0.001 |
| 5 | 0.4579 | 0.1363 | 0.1769 | 0.6678 | 0.001 |
| 6 | 0.4571 | 0.1330 | 0.1766 | 0.7896 | 0.001 |
| 7 | 0.4586 | 0.1307 | 0.1773 | 0.6493 | 0.001 |
| 8 | 0.4579 | 0.1319 | 0.1772 | 0.9475 | 0.001 |
| 9 | 0.4551 | 0.1306 | 0.1746 | 0.7271 | 0.001 |
| 10 | 0.4582 | 0.1316 | 0.1774 | 0.6882 | 0.001 |
| 11 | 0.4683 | 0.1372 | 0.1842 | 0.3715 | 0.001 |
| 12 | 0.4579 | 0.1316 | 0.1764 | 0.5271 | 0.001 |
| 13 | 0.4559 | 0.1301 | 0.1756 | 0.9168 | 0.001 |
| 14 | 0.4556 | 0.1292 | 0.1749 | 0.8827 | 0.001 |
| 15 | 0.4522 | 0.1262 | 0.1717 | 0.7009 | 0.001 |
| 16 | 0.4556 | 0.1286 | 0.1753 | 1.0038 | 0.001 |
| 17 | 0.4586 | 0.1343 | 0.1775 | 0.2600 | 0.001 |
| 18 | 0.5672 | 0.1638 | 0.2369 | 2.0548 | 0.001 |
| 19 | 0.4529 | 0.1287 | 0.1727 | 0.7115 | 0.001 |
| 20 | 0.4552 | 0.1285 | 0.1746 | 0.9694 | 0.001 |
| 21 | 0.4530 | 0.1282 | 0.1724 | 0.7789 | 0.001 |
| 22 | 0.4503 | 0.1261 | 0.1700 | 0.7369 | 0.0001 |
| 23 | 0.4535 | 0.1280 | 0.1716 | 0.5027 | 0.0001 |
| 24 | 0.4502 | 0.1264 | 0.1697 | 0.5968 | 0.0001 |
| 25 | 0.4504 | 0.1267 | 0.1699 | 0.6215 | 0.0001 |
| 26 | 0.4510 | 0.1260 | 0.1704 | 0.6568 | 0.0001 |
| 27 | 0.4498 | 0.1262 | 0.1693 | 0.5748 | 0.0001 |
| 28 | 0.4506 | 0.1256 | 0.1701 | 0.7001 | 0.0001 |
| 29 | 0.4505 | 0.1263 | 0.1699 | 0.5840 | 0.0001 |
| 30 | 0.4506 | 0.1252 | 0.1703 | 0.8101 | 0.0001 |
| 31 | 0.4508 | 0.1249 | 0.1701 | 0.7416 | 0.0001 |
| 32 | 0.4502 | 0.1254 | 0.1697 | 0.6402 | 0.0001 |
| 33 | 0.4511 | 0.1250 | 0.1710 | 0.8411 | 0.0001 |
| 34 | 0.4515 | 0.1259 | 0.1711 | 0.7204 | 1e-05 |
| 35 | 0.4502 | 0.1247 | 0.1698 | 0.7355 | 1e-05 |
| 36 | 0.4509 | 0.1261 | 0.1703 | 0.4990 | 1e-05 |
| 37 | 0.4500 | 0.1260 | 0.1696 | 0.5451 | 1e-05 |
---
# Framework Versions
- **Transformers**: 4.48.0
- **Pytorch**: 2.5.1+cu124
- **Datasets**: 3.0.2
- **Tokenizers**: 0.21.0