afiliot committed
Commit e5af21c
Parent: ceb92bc

Update README.md

Files changed (1):
1. README.md +12 -38
README.md CHANGED
@@ -73,67 +73,41 @@ assert features.shape == (1, 1024)
  ### Direct Use (with Pre-Extracted and Frozen Features)
  
  Phikon-v2 can be used with or without fine-tuning for different downstream applications, including slide-level classification with multiple instance learning algorithms (such as ABMIL) built on top of its frozen tile features.
- [This Colab notebook](https://colab.research.google.com/drive/1zjxscEBgpizHBCwMy-aNz2916AVdB642) allows you to fine-tune Phikon and Phikon-v2 using LoRA through the Hugging Face API.
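
To make the slide-classification route concrete, here is a minimal gated-attention MIL head in the style of ABMIL (Ilse et al., 2018) over a bag of pre-extracted, frozen Phikon-v2 tile features. This is a sketch, not the training code used for the reported results; the hidden dimension and class count are illustrative.

```python
import torch
import torch.nn as nn

class ABMIL(nn.Module):
    """Minimal gated-attention MIL head for slide-level classification
    over a bag of frozen tile embeddings (Ilse et al., 2018)."""

    def __init__(self, in_dim: int = 1024, hidden_dim: int = 128, n_classes: int = 2):
        super().__init__()
        self.attn_V = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.Tanh())
        self.attn_U = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.Sigmoid())
        self.attn_w = nn.Linear(hidden_dim, 1)
        self.classifier = nn.Linear(in_dim, n_classes)

    def forward(self, tiles: torch.Tensor) -> torch.Tensor:
        # tiles: (n_tiles, 1024) frozen [CLS] features for one slide
        scores = self.attn_w(self.attn_V(tiles) * self.attn_U(tiles))  # (n_tiles, 1)
        weights = torch.softmax(scores, dim=0)      # attention over the bag
        slide_emb = (weights * tiles).sum(dim=0)    # (1024,) slide embedding
        return self.classifier(slide_emb)           # (n_classes,) logits

logits = ABMIL()(torch.randn(500, 1024))  # one slide = a bag of 500 tile features
```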
 
  ### Downstream Use (Finetuning)
  
- You can fine-tune the model
+ You can fine-tune the model on tile-level downstream tasks.
+ [This Colab notebook](https://colab.research.google.com/drive/1zjxscEBgpizHBCwMy-aNz2916AVdB642) allows you to fine-tune Phikon and Phikon-v2 using LoRA through the Hugging Face API.
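
For readers who do not want to open the notebook, here is a minimal LoRA sketch with the Hugging Face `peft` library. The checkpoint id `owkin/phikon-v2` refers to this model's repo; the rank, scaling, and target modules are illustrative assumptions, not the notebook's exact settings.

```python
import torch
from transformers import AutoModel
from peft import LoraConfig, get_peft_model

# Load the ViT-L backbone; only the injected LoRA adapters will train.
backbone = AutoModel.from_pretrained("owkin/phikon-v2")
config = LoraConfig(
    r=8,                                # low-rank update rank (assumption)
    lora_alpha=16,                      # scaling factor (assumption)
    target_modules=["query", "value"],  # attention projections to adapt
    lora_dropout=0.1,
)
model = get_peft_model(backbone, config)
model.print_trainable_parameters()      # adapters are a small fraction of 0.3B

# Tile-level fine-tuning would put a task head on the [CLS] token:
cls = model(pixel_values=torch.randn(1, 3, 224, 224)).last_hidden_state[:, 0]
assert cls.shape == (1, 1024)
```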
 
  ## Training Details
  
- - **Training data:** Mass-100K, a pretraining dataset (sourced from MGH, BWH, and GTEx) composed of 75,832,905 [256×256] and 24,297,995 [512×512] histology images at 20× resolution, sampled from 100,402 H&E WSIs (100,130,900 images in total).
+ - **Training data:** PANCAN-XL, a pretraining dataset composed of 456,060,584 [224×224] histology images at 20× resolution, sampled from 60k H&E WSIs.
  - **Training regime:** fp16 using PyTorch-FSDP mixed-precision.
  - **Training objective:** DINOv2 SSL recipe with the following losses:
    - DINO self-distillation loss with multi-crop
    - iBOT masked-image modeling loss
    - KoLeo regularization on [CLS] tokens (see the sketch after this section)
- - **Training length:** 125,000 iterations with a batch size of 3072
+ - **Training length:** 100,000 iterations with a batch size of 4,096
  - **Model architecture:** ViT-Large (0.3B params): Patch size 16, embedding dimension 1024, 16 heads, MLP FFN
- - **Hardware used:** 4x8 Nvidia A100 80GB
- - **Hours trained:** Approx 1024 GPU hours (32 hours total)
- - **Cloud provider:** MGB ERIS Research Computing Core
+ - **Hardware used:** 32x4 Nvidia V100 32GB
+ - **Hours trained:** Approx. 4,300 GPU hours (33 wall-clock hours)
+ - **Platform:** Jean Zay French supercomputer
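
The KoLeo term is the least standard of the three losses above, so here is a small sketch of its published formulation (the Kozachenko-Leonenko regularizer of Sablayrolles et al., as adopted by DINOv2). It mirrors the paper's definition, not the exact training code.

```python
import torch
import torch.nn.functional as F

def koleo_loss(cls_tokens: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """KoLeo regularizer: pushes each [CLS] embedding away from its nearest
    neighbor in the batch, encouraging a uniform spread on the unit sphere."""
    x = F.normalize(cls_tokens, p=2, dim=-1)   # (B, D), l2-normalized
    with torch.no_grad():                      # nearest-neighbor search only
        sims = x @ x.T                         # pairwise cosine similarities
        sims.fill_diagonal_(-1.0)              # exclude self-matches
        nn_idx = sims.argmax(dim=1)            # index of each row's neighbor
    nn_dist = (x - x[nn_idx]).norm(dim=-1)     # Euclidean distance to neighbor
    return -torch.log(nn_dist + eps).mean()    # maximize mean log NN distance

# e.g. koleo_loss(torch.randn(4096, 1024)) on a batch of [CLS] tokens
```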
 
  ## Software Dependencies
  
  **Python Packages**
- - torch>=2.0: https://pytorch.org
+ - torch>=2.0.0: https://pytorch.org
+ - torchvision>=0.15.0: https://pytorch.org/vision/stable/index.html
  - xformers>=0.0.18: https://github.com/facebookresearch/xformers
- - timm>=0.9.8: https://github.com/huggingface/pytorch-image-models
  
  **Repositories**
  - DINOv2 (self-supervised learning): https://github.com/facebookresearch/dinov2
- - CLAM (slide classification): https://github.com/mahmoodlab/CLAM
- - Mask2Former (cell and tissue segmentation): https://github.com/facebookresearch/Mask2Former
- - ViT-Adapter (cell and tissue segmentation): https://github.com/czczup/ViT-Adapter
- - LGSSL (Linear Probe & Few-Shot Eval): https://github.com/mbanani/lgssl
-
- ## License and Terms of Use
- This model and associated code are released under the CC-BY-NC-ND 4.0 license and may only be used for non-commercial, academic research purposes with proper attribution. Any commercial use, sale, or other monetization of the UNI model and its derivatives, which include models trained on outputs from the UNI model or datasets created from the UNI model, is prohibited and requires prior approval. Downloading the model requires prior registration on Hugging Face and agreeing to the terms of use. By downloading this model, you agree not to distribute, publish or reproduce a copy of the model. If another user within your organization wishes to use the UNI model, they must register as an individual user and agree to comply with the terms of use. Users may not attempt to re-identify the deidentified data used to develop the underlying model. If you are a commercial entity, please contact the corresponding author.
  
  ## Contact
- For any additional questions or comments, contact Faisal Mahmood (`faisalmahmood@bwh.harvard.edu`),
- Richard J. Chen (`richardchen@g.harvard.edu`),
- Tong Ding (`tong_ding@g.harvard.edu`),
- or Ming Y. Lu (`mlu16@bwh.harvard.edu`).
+ For any additional questions or comments, contact Alexandre Filiot (`alexandre.filiot@owkin.com`).
  
  ## Acknowledgements
- The project was built on top of amazing repositories such as [ViT](https://github.com/google-research/big_vision), [DINOv2](https://github.com/facebookresearch/dinov2), [LGSSL](https://github.com/mbanani/lgssl), and [Timm](https://github.com/huggingface/pytorch-image-models/) (ViT model implementation). We thank the authors and developers for their contribution.
+ We thank the [DINOv2](https://github.com/facebookresearch/dinov2) authors for their amazing contribution.
+ This work was granted access to the HPC resources of IDRIS under the allocation 2023-A0141012519 made by GENCI.
-
- ## BibTeX
- If you found our work useful in your research, please consider citing our work at:
-
- Chen, R.J., Ding, T., Lu, M.Y., Williamson, D.F.K., et al. Towards a general-purpose foundation model for computational pathology. Nat Med (2024). https://doi.org/10.1038/s41591-024-02857-3
-
- ```
- @article{chen2024uni,
-   title={Towards a General-Purpose Foundation Model for Computational Pathology},
-   author={Chen, Richard J and Ding, Tong and Lu, Ming Y and Williamson, Drew FK and Jaume, Guillaume and Chen, Bowen and Zhang, Andrew and Shao, Daniel and Song, Andrew H and Shaban, Muhammad and others},
-   journal={Nature Medicine},
-   publisher={Nature Publishing Group},
-   year={2024}
- }
- ```
- Works that use UNI should also attribute ViT and DINOv2.