
SpecLab Model Card

This model card focuses on the model associated with the SpecLab space on Hugging Face. The demo is temporarily unavailable; please contact me for access.

Model Details

  • Developed by: Haoli Yin
  • Model type: Atrous Spatial Pyramid Pooling (ASPP) model for Specular Reflection Segmentation in Endoscopic Images
  • Language(s): English
  • License: GPL 3.0
  • Model Description: An ASPP-based model that produces dense pixel-wise segmentation masks of specular reflections detected in endoscopy images (a minimal architecture sketch follows this list).
  • Cite as: see the Citation section at the end of this card.
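For reference, below is a minimal PyTorch sketch of a DeepLab-style ASPP head. It illustrates the general technique only; the channel widths, dilation rates, and surrounding encoder/decoder are assumptions, and the actual SpecLab implementation lives in the GitHub repository linked above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    """Minimal DeepLab-style ASPP head: parallel atrous convolutions at
    several dilation rates plus a global-pooling branch, concatenated and
    projected. Illustrative only; SpecLab's actual head may differ."""

    def __init__(self, in_ch: int, out_ch: int = 256, rates=(6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch, 1, bias=False)]          # 1x1 branch
            + [nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False)
               for r in rates]                                  # atrous branches
        )
        self.pool = nn.Sequential(                              # image-level context
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
        )
        self.project = nn.Conv2d(out_ch * (len(rates) + 2), out_ch, 1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = [branch(x) for branch in self.branches]
        pooled = F.interpolate(self.pool(x), size=(h, w),
                               mode="bilinear", align_corners=False)
        return self.project(torch.cat(feats + [pooled], dim=1))
```

In a segmentation network, a head like this sits on top of an encoder's feature maps, with a final 1x1 convolution down to one channel producing the per-pixel logits for the binary reflection mask.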

Uses

Direct Use

The model is intended to be used to generate dense pixel-wise segmentation maps of specular reflection regions found in endoscopy images. Out-of-scope uses follow from the Limitations and Bias section below.
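A hypothetical usage sketch follows; the checkpoint filename, image path, and preprocessing details are assumptions for illustration, mirroring the training setup described later in this card.

```python
import albumentations as A
from albumentations.pytorch import ToTensorV2
import cv2
import torch

# Hypothetical checkpoint path; assumes a pickled full model was saved.
model = torch.load("speclab.pt", map_location="cpu")
model.eval()

# Preprocessing mirroring the training setup: scale to [0, 1], convert to tensor.
transform = A.Compose([A.ToFloat(max_value=255.0), ToTensorV2()])
image = cv2.cvtColor(cv2.imread("endoscopy_frame.png"), cv2.COLOR_BGR2RGB)
x = transform(image=image)["image"].unsqueeze(0)  # (1, 3, H, W)

with torch.no_grad():
    logits = model(x)
    mask = (torch.sigmoid(logits) > 0.5).squeeze().numpy()  # binary reflection mask
```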

Downstream Use

The model could also be used for downstream use cases, including further research efforts, such as detecting specular reflection in other real-world scenarios. This application would require fine-tuning the model with domain-specific datasets.

Limitations and Bias

Limitations

The performance of the model may degrade when applied to non-biological tissue images. There may also be edge cases where the model fails to detect specular reflection, especially if the reflection is a color other than white.

Bias

The model was trained on endoscopy video data, so it detects specular reflection more reliably on biological tissue backgrounds than in other settings.

Limitations and Bias Recommendations

  • Users (both direct and downstream) should be made aware of these biases and limitations.
  • Further work on this model should include methods for balanced representations of different types of specular reflections.

Training

Training Data

The GLENDA "no pathology" dataset was used to train the model:

  • GLENDA Dataset, which contains ~12k image frames.
  • Masks (to be released) were generated using the specular reflection detection pipeline described in this paper (to be released).
  • Train/val/test sets were split randomly with a 60/20/20 distribution (see the sketch after this list).
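A minimal sketch of such a split, assuming a PyTorch Dataset of (image, mask) pairs; `dataset` and the seed are stand-ins for illustration:

```python
import torch
from torch.utils.data import random_split

# `dataset` stands in for the ~12k GLENDA frames with their masks.
n = len(dataset)
n_train, n_val = int(0.6 * n), int(0.2 * n)
train_set, val_set, test_set = random_split(
    dataset,
    [n_train, n_val, n - n_train - n_val],          # 60/20/20 split
    generator=torch.Generator().manual_seed(42),    # assumed seed for reproducibility
)
```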

Training and Evaluation Procedure & Results

You can view the training logs on Weights and Biases.

During training, input images pass through the system as follows:

  • Images are augmented by albumentations with horizontal/vertical flips, normalized to [0, 1], and converted to tensors.
  • A forward pass is run through the model, which outputs logits.
  • The loss is binary cross-entropy with logits between the model's prediction logits and the ground-truth masks.
  • The logits are passed through a sigmoid activation and thresholded at 0.5 to binarize the output (see the sketch after this list).
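Putting the steps above together, a minimal sketch; the flip probabilities and exact albumentations parameters are assumptions:

```python
import albumentations as A
from albumentations.pytorch import ToTensorV2
import torch
import torch.nn.functional as F

# Augmentations as described above: flips, scale to [0, 1], tensor conversion.
train_transform = A.Compose([
    A.HorizontalFlip(p=0.5),      # assumed probability
    A.VerticalFlip(p=0.5),        # assumed probability
    A.ToFloat(max_value=255.0),   # normalize pixel values to [0, 1]
    ToTensorV2(),                 # HWC numpy array -> CHW torch tensor
])

def training_step(model, images, masks):
    """One step: forward pass to logits, BCE-with-logits loss against the
    ground-truth masks, and a 0.5 sigmoid threshold for the binary prediction."""
    logits = model(images)
    loss = F.binary_cross_entropy_with_logits(logits, masks.float())
    preds = (torch.sigmoid(logits) > 0.5).float()
    return loss, preds
```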

The simplified training procedure for SpecLab is as follows (a configuration sketch follows the list):

  • Hardware: One 16GB NVIDIA Tesla V100-SXM2
  • Optimizer: Adam
  • Batch: 4 samples
  • Learning rate: initialized at 0.001, then decayed with CosineAnnealingLR (T_max = 20).
  • Epochs: 10
  • Steps: 18k
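These hyperparameters translate to roughly the following PyTorch setup. Here `model` and `train_loader` are stand-ins, and whether the cosine schedule steps per epoch or per batch is an assumption:

```python
import torch
import torch.nn.functional as F

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # initial LR 0.001
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=20)

for epoch in range(10):                      # 10 epochs, ~18k steps total
    for images, masks in train_loader:       # batches of 4 samples
        optimizer.zero_grad()
        loss = F.binary_cross_entropy_with_logits(model(images), masks.float())
        loss.backward()
        optimizer.step()
    scheduler.step()                         # assumed: one cosine step per epoch
```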

Environmental Impact

SpecLab Estimated Emissions

CO2 emissions were estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019), based on the hardware, runtime, cloud provider, and compute region listed below (a back-of-the-envelope check follows the list).

  • Hardware Type: Tesla V100-SXM2
  • Hours used: 6
  • Cloud Provider: Google Colab
  • Compute Region: us-south1
  • Carbon Emitted (Power consumption x Time x Carbon produced based on location of power grid): 0.7146 kg CO2 eq.
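As a rough check of the figure above: the ~300 W board power is the V100-SXM2's rated power, and the grid-intensity value below is simply the one implied by the reported total, not an official us-south1 number.

```python
power_kw = 0.300     # Tesla V100-SXM2 rated board power (assumed)
hours = 6            # runtime from above
intensity = 0.397    # kg CO2 eq. per kWh, implied by the reported total
print(power_kw * hours * intensity)  # -> ~0.7146 kg CO2 eq.
```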

Citation

@misc{Yin_SpecLab_2022,
      author = {Yin, Haoli},
      doi = {TBD},
      month = {8},
      title = {SpecLab},
      url = {https://github.com/Nano1337/SpecLab},
      year = {2022}
}

This model card was written by: Haoli Yin
