
Transfer Learning Vision Transformer (ViT) - Google ViT-Base 224

Description

This model is a Vision Transformer (ViT) fine-tuned via transfer learning from Google's ViT-Base architecture (224×224 input). It was fine-tuned on a dataset of fungal and lichen images collected in Russia. A minimal inference sketch follows the model information below.

Model Information

  • Model Name: Transfer Learning ViT - Google ViT-Base 224
  • Model Architecture: Vision Transformer (ViT)
  • Base Architecture: Google ViT-Base with 224×224 input resolution
  • Pre-trained on: ImageNet
  • Fine-tuned on: Fungal and lichen image dataset from Russia
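
A minimal inference sketch using the Hugging Face transformers library is shown below. The repository id is a placeholder, not the model's actual id; substitute the real fine-tuned checkpoint.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

# Placeholder repo id; replace with the actual fine-tuned checkpoint.
MODEL_ID = "your-username/vit-base-patch16-224-fungi-russia"

processor = AutoImageProcessor.from_pretrained(MODEL_ID)
model = AutoModelForImageClassification.from_pretrained(MODEL_ID)
model.eval()

# Load a specimen photograph and convert it to model inputs.
image = Image.open("mushroom.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

predicted_id = logits.argmax(-1).item()
print(model.config.id2label[predicted_id])
```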

Performance

  • Accuracy: 90.31%
  • F1 Score: 86.33%
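
The card does not state how the F1 score is averaged. A common setup computes both metrics in a compute_metrics callback for the Hugging Face Trainer, sketched below with scikit-learn under the assumption of macro-averaged F1.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    """Compute accuracy and F1 from Trainer eval predictions."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, predictions),
        # Macro averaging is an assumption; the card does not specify it.
        "f1": f1_score(labels, predictions, average="macro"),
    }
```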

Training Details

  • Training Loss:
    • Initial: 1.043200
    • Final: 0.116200
  • Validation Loss:
    • Initial: 0.822428
    • Final: 0.335994
  • Training Epochs: 10
  • Training Runtime: 18575.04 seconds
  • Training Samples per Second: 33.327
  • Training Steps per Second: 1.042
  • Total FLOPs: 4.801 x 10^19
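
A sketch of a fine-tuning run consistent with these figures (10 epochs with the Hugging Face Trainer) is shown below. The base checkpoint google/vit-base-patch16-224, the imagefolder dataset layout, the batch size, and the learning rate are all assumptions; the card does not record them.

```python
import torch
from datasets import load_dataset
from transformers import (
    AutoImageProcessor,
    AutoModelForImageClassification,
    Trainer,
    TrainingArguments,
)

BASE_MODEL = "google/vit-base-patch16-224"  # assumed base checkpoint

# Assumed local dataset layout: one sub-folder per fungal/lichen species.
ds = load_dataset("imagefolder", data_dir="fungi_images")
labels = ds["train"].features["label"].names

processor = AutoImageProcessor.from_pretrained(BASE_MODEL)

def transform(batch):
    # Convert PIL images into the pixel tensors the ViT expects.
    inputs = processor(
        images=[img.convert("RGB") for img in batch["image"]],
        return_tensors="pt",
    )
    inputs["labels"] = batch["label"]
    return inputs

ds = ds.with_transform(transform)

def collate_fn(examples):
    # Stack per-example tensors into a training batch.
    return {
        "pixel_values": torch.stack([e["pixel_values"] for e in examples]),
        "labels": torch.tensor([e["labels"] for e in examples]),
    }

model = AutoModelForImageClassification.from_pretrained(
    BASE_MODEL,
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    ignore_mismatched_sizes=True,  # replace the 1000-class ImageNet head
)

args = TrainingArguments(
    output_dir="vit-fungi-russia",
    num_train_epochs=10,             # matches the reported training epochs
    per_device_train_batch_size=32,  # assumed; not stated in the card
    learning_rate=2e-5,              # assumed; not stated in the card
    eval_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    data_collator=collate_fn,
    train_dataset=ds["train"],
    eval_dataset=ds["validation"],   # assumes a validation split exists
    compute_metrics=compute_metrics, # as sketched under Performance
)
trainer.train()
```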

Recommended Use Cases

  • Species classification of fungi and lichens found in Russia.
  • Fungal biodiversity studies.
  • Image recognition tasks involving fungal and lichen species (a quick usage sketch follows this list).
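
For quick classification of field photographs, the checkpoint can also be wrapped in a transformers pipeline. The repository id below is again a placeholder for the actual fine-tuned checkpoint.

```python
from transformers import pipeline

# Placeholder repo id; replace with the actual fine-tuned checkpoint.
classifier = pipeline(
    "image-classification",
    model="your-username/vit-base-patch16-224-fungi-russia",
)

# Print the top predictions for a field photograph of a specimen.
for result in classifier("field_photo.jpg", top_k=3):
    print(f"{result['label']}: {result['score']:.3f}")
```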

Limitations

  • The model's performance is optimized for fungal species and may not generalize well to other domains.
  • The model may not perform well on images of fungi and lichen species from regions other than Russia.

Model Author

Siddhant Dutta
