phytoClassUCSC - A phytoplankton classifier for IFCB data

TRY IT OUT HERE:
https://colab.research.google.com/drive/1mv4xs8NHyyqls9OMfZ74HpzCLi9GlkTX?usp=sharing

Note: Sections and prompts from the model cards paper, v2.

Jump to section:

Model details
Intended use
Factors
Metrics
Evaluation data
Training data
Quantitative analyses
Ethical considerations
Caveats and recommendations

Model details

Developed by the Kudela Lab from the Ocean Sciences Department at University of California, Santa Cruz.
Current version trained in February, 2023.
Version 1.0
phytoClassUCSC is a depthwise-separable CNN based on the Xception architecture (Chollet, F., 2017) with 134 layers, using weights pretrained on ImageNet.
An average pooling layer is used.
Licensed under CC-BY-SA-4.0
For questions, email Patrick Daniel (pcdaniel@ucsc.edu).

Intended use

This model was designed and trained to work with IFCB data generated in Monterey Bay. It may still perform well in other locations, but the distribution of training images reflects the common phytoplankton observed at the Santa Cruz Wharf and Power Buoy locations.

Independent model validation should be used when applying the model to other sites.

Primary intended uses

Generalized micro-phytoplankton classifier for common taxa found in the Monterey Bay.

Primary intended users

Researchers interested in a general classifier for common micro-phytoplankton taxa in Monterey Bay.

Out-of-scope use cases

Observing and identifying rare or non-endemic taxa.

Factors

Model classes were chosen based on common and resolvable phytoplankton taxa. Taxonomic groupings reflect the groups that researchers in the lab felt could be confidently identified, given the expertise and research interests of the lab.

Instrument

The model was trained on images from Imaging FlowCytobot (IFCB) instruments primarily deployed at the Santa Cruz Wharf and the Monterey Bay Aquarium Research Institute (MBARI) Power Buoy. The Santa Cruz Wharf IFCB (#104) is an early-generation instrument.

Metrics

Deployed model performance will vary with the natural variability in the observed phytoplankton communities over different time scales (e.g., seasonality). As such, model performance should be evaluated throughout IFCB deployments using independently labeled images.
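One way to track performance through a deployment is to bin independently labeled images by month and compare the agreement between labels and predictions in each bin. The sketch below is illustrative only: the record layout, the function name `accuracy_by_month`, and the placeholder class names are assumptions, not part of the phytoClassUCSC code.

```python
from collections import defaultdict


def accuracy_by_month(records):
    """Return {"YYYY-MM": fraction of predictions matching the manual label}.

    `records` is an iterable of (iso_date, true_label, predicted_label)
    tuples. This layout is a hypothetical one chosen for illustration.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for iso_date, true_label, predicted_label in records:
        month = iso_date[:7]  # keep the "YYYY-MM" part of an ISO date
        total[month] += 1
        if true_label == predicted_label:
            correct[month] += 1
    return {month: correct[month] / total[month] for month in sorted(total)}


# Placeholder labels; these are not the model's actual class names.
records = [
    ("2023-03-02", "taxon_a", "taxon_a"),
    ("2023-03-15", "taxon_b", "taxon_a"),
    ("2023-08-01", "taxon_c", "taxon_c"),
    ("2023-08-09", "taxon_c", "taxon_c"),
]
monthly = accuracy_by_month(records)
print(monthly)
```

A drop in agreement for a given month suggests a shift in the community (or in imaging conditions) that the training distribution does not cover well.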

Model performance measures

Model performance during training was evaluated using a validation set held back from the training data. F1-scores were calculated for each class. See Results here
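For reference, the per-class F1-score is the harmonic mean of precision and recall, which can be written as F1 = 2TP / (2TP + FP + FN). The following is a minimal pure-Python sketch of that calculation; the function name `per_class_f1` and the placeholder labels are assumptions for illustration and do not reproduce the lab's evaluation code.

```python
from collections import Counter


def per_class_f1(true_labels, predicted_labels):
    """Compute F1 = 2TP / (2TP + FP + FN) separately for each class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in sorted(set(true_labels) | set(predicted_labels)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


# Placeholder labels; these are not the model's actual class names.
y_true = ["taxon_a", "taxon_a", "taxon_b", "taxon_b", "taxon_c"]
y_pred = ["taxon_a", "taxon_b", "taxon_b", "taxon_b", "taxon_a"]
f1 = per_class_f1(y_true, y_pred)
print(f1)
```

Reporting F1 per class, rather than a single overall accuracy, keeps rare classes from being masked by abundant ones.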

Approaches to uncertainty and variability

Uncertainty is addressed by applying a set of class-specific thresholds for each prediction. This works reasonably well for out-of-distribution images.
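The thresholding idea can be sketched as follows. This is an assumption-laden illustration, not the model's actual post-processing: the function name `apply_class_thresholds`, the threshold values, the default threshold, and the choice to relabel low-scoring predictions as "unclassified" are all hypothetical.

```python
def apply_class_thresholds(scores, thresholds, default_threshold=0.5,
                           reject_label="unclassified"):
    """Accept the top-scoring class only if it clears that class's threshold.

    `scores` maps class name -> model score for a single image.
    `thresholds` maps class name -> minimum score required for that class.
    Both the rejection behaviour and the default are assumptions.
    """
    best_class = max(scores, key=scores.get)
    threshold = thresholds.get(best_class, default_threshold)
    if scores[best_class] >= threshold:
        return best_class
    return reject_label


# Hypothetical thresholds; real values would be tuned per class.
thresholds = {"taxon_a": 0.9, "taxon_b": 0.6}
confident = apply_class_thresholds({"taxon_a": 0.95, "taxon_b": 0.05}, thresholds)
uncertain = apply_class_thresholds({"taxon_a": 0.70, "taxon_b": 0.30}, thresholds)
print(confident, uncertain)
```

Under these assumptions, an out-of-distribution image that produces only a weak top score falls below its class threshold and is set aside rather than counted toward a taxon.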

Training data

To Be Described

Ethical considerations

None

Caveats and recommendations

This model was developed as an iteration of previous classification efforts and as such is subject to a history of decision making that is not captured here. For that reason, this classifier is not a panacea for all phytoplankton image data, but was specifically developed for looking at phytoplankton communities in Monterey Bay.

IFCB-collected data are highly context-specific and subject to both observation configurations and small-scale variability.

Review section 4.9 of the model cards paper.