phytoClassUCSC - A phytoplankton classifier for IFCB data
TRY IT OUT HERE:
https://colab.research.google.com/drive/1mv4xs8NHyyqls9OMfZ74HpzCLi9GlkTX?usp=sharing
Note: Sections and prompts from the model cards paper, v2.
Jump to section:
- Model details
- Intended use
- Factors
- Metrics
- Evaluation data
- Training data
- Quantitative analyses
- Ethical considerations
- Caveats and recommendations
Model details
- Developed by the Kudela Lab from the Ocean Sciences Department at University of California, Santa Cruz.
- Current version trained in February, 2023.
- Version 1.0
- phytoClassUCSC is a depthwise-separable CNN based on the Xception architecture (Chollet, F., 2017) with 134 layers, using weights pretrained on ImageNet.
- An average pooling layer is used.
- Licensed under CC-BY-SA-4.0
- For questions, email Patrick Daniel (pcdaniel@ucsc.edu)
Intended use
This model was designed and trained to work with IFCB data generated in Monterey Bay. It may still perform well in other locations, but the distribution of training images reflects the common phytoplankton observed at the Santa Cruz Wharf and Power Buoy locations.
Independent model validation should be used when applying the model to other sites.
Primary intended uses
Generalized micro-phytoplankton classifier for common taxa found in the Monterey Bay.
Primary intended users
Researchers interested in a general micro-phytoplankton classifier.
Out-of-scope use cases
Observing and identifying rare or non-endemic taxa.
Factors
Model classes were chosen based on common and resolvable phytoplankton taxa. Taxonomic groupings were chosen based on which groups researchers in the lab felt could be confidently identified, given the expertise and research interests of the lab.
Instrument
The model was trained on images from Imaging FlowCytobot (IFCB) instruments primarily deployed at the Santa Cruz Wharf and the Monterey Bay Aquarium Research Institute (MBARI) Power Buoy. The Santa Cruz Wharf IFCB (#104) is an early-generation instrument.
Metrics
Deployed model performance will vary with the natural variability in the observed phytoplankton communities over different time scales (e.g., seasonality). As such, model performance should be evaluated throughout IFCB deployments using independently labeled images.
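One way to track performance over a deployment is to group independently labeled images by month and compute accuracy for each group. The sketch below is only an illustration: the function name `accuracy_by_month`, the record layout, and the placeholder labels (`taxon_a`, `taxon_b`) are assumptions and are not part of the phytoClassUCSC code or its class list.

```python
from collections import defaultdict
from datetime import date


def accuracy_by_month(records):
    """Compute accuracy per (year, month).

    records: iterable of (sample_date, true_label, predicted_label) tuples,
    where true_label comes from independent manual annotation.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for sample_date, truth, pred in records:
        key = (sample_date.year, sample_date.month)
        total[key] += 1
        correct[key] += int(truth == pred)
    # Sort keys so the result reads chronologically.
    return {key: correct[key] / total[key] for key in sorted(total)}


# Hypothetical records for illustration only.
records = [
    (date(2023, 3, 4), "taxon_a", "taxon_a"),
    (date(2023, 3, 18), "taxon_b", "taxon_a"),
    (date(2023, 8, 2), "taxon_b", "taxon_b"),
    (date(2023, 8, 9), "taxon_a", "taxon_a"),
]
monthly = accuracy_by_month(records)
```

A drop in a given month's score would suggest a shift in the community relative to the training distribution and may be a signal to label more images from that period.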
Model performance measures
Training model performance was evaluated using a held-back validation subset of the training set. F1-scores were calculated for each class. See Results here
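For reference, a per-class F1-score is the harmonic mean of precision and recall for that class, which reduces to 2·TP / (2·TP + FP + FN). The following is a minimal sketch of that calculation in plain Python. The function name `per_class_f1` and the placeholder labels are assumptions for illustration; the source does not state which tooling was used to compute the reported scores.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {class_name: F1} for every class seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[cls] + fp[cls] + fn[cls]
        scores[cls] = 2 * tp[cls] / denom if denom else 0.0
    return scores


# Hypothetical validation labels for illustration only.
y_true = ["taxon_a", "taxon_a", "taxon_b", "taxon_c"]
y_pred = ["taxon_a", "taxon_b", "taxon_b", "taxon_c"]
f1_by_class = per_class_f1(y_true, y_pred)
```

Reporting F1 per class rather than a single overall accuracy keeps rare classes from being hidden by abundant ones.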
Approaches to uncertainty and variability
Uncertainty is addressed by applying a set of class-specific thresholds for each prediction. This works reasonably well for out-of-distribution images.
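A minimal sketch of one common thresholding scheme follows: take the top-scoring class and keep it only if its score clears that class's own cutoff, otherwise mark the image as unclassified. The source does not describe the exact rule or threshold values, so the function name `apply_class_thresholds`, the default cutoff, the `"unclassified"` label, and the example numbers are all assumptions.

```python
def apply_class_thresholds(scores, thresholds, default_threshold=0.5,
                           fallback="unclassified"):
    """Return the top-scoring class if it clears its class-specific cutoff.

    scores: {class_name: model score} for a single image.
    thresholds: {class_name: minimum score required to accept that class}.
    """
    if not scores:
        return fallback
    best_class = max(scores, key=scores.get)
    cutoff = thresholds.get(best_class, default_threshold)
    return best_class if scores[best_class] >= cutoff else fallback


# Hypothetical thresholds and scores for illustration only.
thresholds = {"taxon_a": 0.9, "taxon_b": 0.6}
rejected = apply_class_thresholds({"taxon_a": 0.8, "taxon_b": 0.2}, thresholds)
accepted = apply_class_thresholds({"taxon_a": 0.3, "taxon_b": 0.7}, thresholds)
```

In this scheme, an out-of-distribution image that produces only low or diffuse scores falls below every cutoff and is left unclassified rather than forced into a class.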
Training data
To Be Described
Ethical considerations
None
Caveats and recommendations
This model was developed as an iteration of previous classification efforts and as such is subject to a history of decision making that is not captured here. For that reason, this classifier is not a panacea for all phytoplankton image data, but was specifically developed for looking at phytoplankton communities in Monterey Bay.
IFCB-collected data are very context-specific and subject to both observation configurations and small-scale variability.
Review section 4.9 of the model cards paper.