---
license: gpl-3.0
---

# Breast Estrogen Receptor (ER) GAN v1 Model Card
This model card describes a model associated with a manuscript that is currently under review. Links to the manuscript will be provided once publicly available.

## Model Details
- **Developed by:** James Dolezal
- **Model type:** Generative adversarial network
- **Language(s):** English
- **License:** GPL-3.0
- **Model Description:** This is a StyleGAN2 model that can generate synthetic H&E pathologic images of breast cancer. The GAN is conditioned on estrogen receptor (ER) status as determined by immunohistochemical testing, with categories ER-negative (=0) and ER-positive (=1).
- **Image processing:** This model generates images at 512 x 512 px resolution and was trained on lossless (PNG) pathologic images, each covering a 400 x 400 μm field of view.
- **Resources for more information:** [GitHub Repository](https://github.com/jamesdolezal/histologic-sheep)

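The image geometry listed under Model Details implies a fixed physical pixel size, which can be checked with a few lines of arithmetic. This is a minimal sketch; the constant and function names are illustrative and are not taken from the associated repository.

```python
# Values taken from the model card; names are illustrative only.
TILE_WIDTH_UM = 400  # physical width of each training tile, in microns
TILE_WIDTH_PX = 512  # width of each generated image, in pixels


def microns_per_pixel(width_um: float, width_px: int) -> float:
    """Return the physical width of a single pixel, in microns."""
    if width_px <= 0:
        raise ValueError("width_px must be positive")
    return width_um / width_px


mpp = microns_per_pixel(TILE_WIDTH_UM, TILE_WIDTH_PX)
print(f"Effective pixel size: {mpp} um/px")  # prints 0.78125 um/px
```

Each generated pixel therefore spans roughly 0.78 μm of tissue, which may be useful when pairing synthetic images with classifiers trained at a specific resolution.
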
# Uses

## Examples
This model is a [StyleGAN2](https://github.com/NVlabs/stylegan3) model and can be used with any StyleGAN-compatible scripts and tools. The [GitHub repository](https://github.com/jamesdolezal/histologic-sheep) associated with this model includes detailed information on how to interface with the GAN, generate images, and perform class blending via embedding interpolation.

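To illustrate the idea behind class conditioning and class blending, the sketch below builds a one-hot ER label and linearly interpolates between two class embeddings. It is not the repository's interface: the function names and the 4-dimensional embedding vectors are hypothetical placeholders, and in practice the embeddings would come from the trained generator. Refer to the repository for the supported workflow.

```python
from typing import List

ER_NEGATIVE, ER_POSITIVE = 0, 1  # class indices as listed in this model card


def one_hot(er_status: int, num_classes: int = 2) -> List[float]:
    """Encode an ER status as a one-hot conditioning label."""
    if er_status not in range(num_classes):
        raise ValueError(f"er_status must be in 0..{num_classes - 1}")
    return [1.0 if i == er_status else 0.0 for i in range(num_classes)]


def blend(start: List[float], end: List[float], steps: int) -> List[List[float]]:
    """Linearly interpolate from `start` to `end`, endpoints included."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if len(start) != len(end):
        raise ValueError("embeddings must have the same length")
    path = []
    for k in range(steps):
        t = k / (steps - 1)  # 0.0 at the first step, 1.0 at the last
        path.append([(1 - t) * a + t * b for a, b in zip(start, end)])
    return path


# Hypothetical class embeddings, for illustration only.
emb_negative = [0.0, 1.0, -1.0, 2.0]
emb_positive = [1.0, 0.0, 1.0, 0.0]

path = blend(emb_negative, emb_positive, steps=5)
print(one_hot(ER_POSITIVE))  # [0.0, 1.0]
print(path[2])               # midpoint: [0.5, 0.5, 0.0, 1.0]
```

Feeding each intermediate embedding to the generator with a fixed latent seed would, under these assumptions, yield a sequence of images that transitions between the two classes.
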
## Direct Use
This model is intended for research purposes only. Possible research areas and tasks include:

- Applications in educational settings.
- Research on pathology classification models for breast cancer.

Excluded uses are described below.

### Misuse and Out-of-Scope Use
Output from this model should not be used in a clinical setting or be provided to patients, physicians, or any other health care team members directly involved in a patient's care outside the context of an approved research protocol. Using the model in a clinical setting outside the context of an approved research protocol is a misuse of this model. This includes influencing a patient's health care treatment in any way based on output from this model.

### Limitations

The model does not generate images reflective of estrogen receptor status in a manner that controls for possible underlying biological bias, such as tumor grade or histological subtype.

### Bias
This model was trained on The Cancer Genome Atlas (TCGA), which contains patient data from communities and cultures that may not reflect the general population. This dataset comprises images from multiple institutions, which may introduce bias from site-specific batch effects ([Howard, 2021](https://www.nature.com/articles/s41467-021-24698-1)).

## Training

**Training Data**
The following dataset was used to train the model:

- The Cancer Genome Atlas (TCGA), BRCA cohort (see next section)

This model was trained on a total of 1,048 slides, with 228 ER-negative tumors and 820 ER-positive tumors.

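The class balance implied by these slide counts can be worked out directly. The snippet below is a small arithmetic sketch using only the numbers stated above; the variable names are illustrative.

```python
# Slide counts as reported in this model card.
slide_counts = {"ER-negative": 228, "ER-positive": 820}

total_slides = sum(slide_counts.values())
class_fractions = {label: n / total_slides for label, n in slide_counts.items()}

print(f"Total slides: {total_slides}")
for label, fraction in class_fractions.items():
    print(f"{label}: {fraction:.1%}")
```

ER-positive slides outnumber ER-negative slides by roughly 3.6 to 1.
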
**Training Procedure**
Each whole-slide image was sectioned in a grid-wise fashion into image tiles of 400 x 400 μm. Image tiles were extracted at the nearest downsample layer and resized to 512 x 512 px using [Libvips](https://www.libvips.org/API/current/libvips-resample.html#vips-resize). During training, images were randomly flipped and rotated (90°, 180°, or 270°). Training was otherwise identical to the official StyleGAN2 implementation.

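The tiling and augmentation steps described above can be sketched in plain Python. This is an illustration only, not the training pipeline: images are represented as small nested lists rather than pixel arrays, partial tiles at the slide edge are assumed to be dropped, and the flip is assumed to be horizontal with probability 0.5, since the source does not specify these details. All function names are hypothetical.

```python
import random
from typing import List, Tuple

Image = List[List[int]]  # toy stand-in for a pixel array


def tile_origins(slide_w_um: float, slide_h_um: float,
                 tile_um: float = 400.0) -> List[Tuple[float, float]]:
    """Top-left corners (x, y), in microns, of non-overlapping grid tiles."""
    cols = int(slide_w_um // tile_um)  # assumption: partial edge tiles dropped
    rows = int(slide_h_um // tile_um)
    return [(c * tile_um, r * tile_um) for r in range(rows) for c in range(cols)]


def rotate90(img: Image) -> Image:
    """Rotate a 2D grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip_horizontal(img: Image) -> Image:
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in img]


def augment(img: Image, rng: random.Random) -> Image:
    """Apply a random flip, then a rotation of 0, 90, 180, or 270 degrees."""
    if rng.random() < 0.5:  # assumption: horizontal flip, probability 0.5
        img = flip_horizontal(img)
    for _ in range(rng.randrange(4)):  # number of clockwise quarter turns
        img = rotate90(img)
    return img


origins = tile_origins(1000, 900)  # hypothetical 1000 x 900 micron region
tile = [[1, 2], [3, 4]]
augmented = augment(tile, random.Random(0))
print(origins)
print(rotate90(tile))  # [[3, 1], [4, 2]]
```

Because flips and quarter-turn rotations only rearrange pixels, they require no interpolation and leave the tile content unaltered.
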
Additional training information:

- **Hardware:** 4 x A100 GPUs
- **Batch size:** 32
- **R1 gamma:** 1.6384
- **Training length:** 10,000 kimg

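In the StyleGAN2 codebases, training length is measured in kimg, meaning thousands of real images shown to the discriminator. The sketch below converts the listed settings into image and batch counts. It also shows that the listed R1 gamma coincides with the heuristic `0.0002 * resolution^2 / batch_size` used by the automatic configuration in NVIDIA's StyleGAN2-ADA training script; whether that heuristic was actually used for this model is an assumption, as is the even split of the batch across GPUs.

```python
import math

# Settings as listed in this model card.
RESOLUTION_PX = 512
BATCH_SIZE = 32
NUM_GPUS = 4
TOTAL_KIMG = 10_000
REPORTED_R1_GAMMA = 1.6384

total_images = TOTAL_KIMG * 1000         # real images shown to the discriminator
batches = total_images // BATCH_SIZE     # number of training batches
per_gpu_batch = BATCH_SIZE // NUM_GPUS   # assumption: batch split evenly

# Assumption: StyleGAN2-ADA "auto" heuristic for the R1 regularization weight.
heuristic_gamma = 0.0002 * RESOLUTION_PX ** 2 / BATCH_SIZE

print(f"Images shown: {total_images:,}")
print(f"Batches: {batches:,}")
print(f"Per-GPU batch size: {per_gpu_batch}")
print(f"Matches reported gamma: {math.isclose(heuristic_gamma, REPORTED_R1_GAMMA)}")
```

Under these assumptions, training covers 10 million images in 312,500 batches, with 8 images per GPU per batch.
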
## Evaluation Results
External evaluation results are currently under peer review and will be posted once publicly available.