MUmairAB committed on
Commit
09d60af
1 Parent(s): 3bb21b6

Push Keras model using huggingface_hub.

README.md CHANGED
@@ -1,87 +1,50 @@
  ---
- license: mit
- pipeline_tag: image-classification
- tags:
- - keras
- - breast cancer detection
- - histopathology images
- - invasive ductal carcinoma
- - convolutional neural network
- - medical image processing
- - umair akram
  ---
 
- # Breast Cancer Detection using CNNs in TensorFlow

- In this project, a Convolutional Neural Network (CNN) is used to detect breast cancer. The model takes patches of **Histopathological Images** of patients' breast tissue and determines whether the tissue within each patch contains **Invasive Ductal Carcinoma** (**IDC**). By analyzing individual patches instead of the entire breast image, the model can localize cancerous tissue precisely.

- ## Stats about breast cancer

- According to the World Health Organization (WHO), in 2020 alone, there were [2.3 million](https://www.who.int/news-room/fact-sheets/detail/breast-cancer) reported cases of breast cancer among women, resulting in **685,000** deaths worldwide. By the end of the same year, approximately **7.8 million** women had been diagnosed with breast cancer within the past five years, making it the most prevalent form of cancer globally.

- Worldwide, female breast cancer is the [fifth](https://www.cancer.net/cancer-types/breast-cancer/statistics#:~:text=It%20is%20estimated%20that%2043%2C700,world%20died%20from%20breast%20cancer.) leading cause of cancer death.

- ## Introduction

- As stated by [Pamela Wright](https://www.hopkinsmedicine.org/health/conditions-and-diseases/breast-cancer/invasive-ductal-carcinoma-idc), the medical director of the Breast Center at Johns Hopkins, **Invasive Ductal Carcinoma** (**IDC**), also referred to as **infiltrating ductal carcinoma**, is the predominant type of breast cancer. It represents about 80% of all breast cancer diagnoses. For further information about IDC, please refer to this [article](https://www.breastcancer.org/types/invasive-ductal-carcinoma). In this project, we developed a classification model based on Convolutional Neural Networks (CNNs) using TensorFlow. The model uses **Histopathology Images** of patients to classify whether they have breast cancer or not.
 
- ## Dataset
- We are using the dataset from Kaggle. It can be accessed [here](https://www.kaggle.com/datasets/paultimothymooney/breast-histopathology-images/code?datasetId=7415&sortBy=voteCount). The following are some of the properties of this dataset:

- The initial dataset comprised 162 slide images of breast cancer specimens scanned at a magnification of 40x. Due to their large dimensions, 277,524 patches measuring 50×50 pixels were extracted from these images to improve their manageability. These patches encompass the regions that contain Invasive Ductal Carcinoma (IDC), thereby enabling more efficient processing and analysis.

- * 198,738 negative examples (i.e., no breast cancer)
- * 78,786 positive examples (i.e., indicating breast cancer)
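The class counts listed in the removed README can be checked with a few lines of arithmetic. The sketch below only confirms that the two counts add up to the stated 277,524 patches and reports the class proportions; the variable names are illustrative, and the proportions in any particular train/test split may differ.

```python
# Patch counts as stated in the removed README text.
negatives = 198_738  # IDC-negative patches
positives = 78_786   # IDC-positive patches

total = negatives + positives

# Share of each class in the full dataset.
positive_fraction = positives / total
negative_fraction = negatives / total

print(f"total patches:     {total}")
print(f"positive fraction: {positive_fraction:.3f}")
print(f"negative fraction: {negative_fraction:.3f}")
```

Because roughly 72% of patches are negative, a classifier that always predicts "non-IDC" would already score about that accuracy on data with the same ratio, which is useful context when reading accuracy figures.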
 
- The dataset assigns a unique filename structure to each image, like:
- ```
- u_xX_yY_classC.png

- ```
- For example:
- ```
- 10253_idx5_x1351_y1101_class0.png
- ```

- - "u" is the patient ID (10253_idx5),
- - "X" is the x-coordinate of where this patch was cropped from,
- - "Y" is the y-coordinate of where this patch was cropped from, and
- - "C" indicates the class where 0 is non-IDC and 1 is IDC.
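The filename convention described above can be parsed with a regular expression. This is a minimal sketch, not code from the repository: `PatchInfo`, `PATCH_PATTERN`, and `parse_patch_filename` are hypothetical names, and the pattern assumes every file follows the `u_xX_yY_classC.png` layout exactly.

```python
import re
from typing import NamedTuple


class PatchInfo(NamedTuple):
    """Fields encoded in a patch filename (hypothetical helper type)."""
    patient_id: str
    x: int
    y: int
    label: int  # 0 = non-IDC, 1 = IDC


# Assumed layout: <patient id>_x<X>_y<Y>_class<C>.png
PATCH_PATTERN = re.compile(
    r"^(?P<patient>.+)_x(?P<x>\d+)_y(?P<y>\d+)_class(?P<label>[01])\.png$"
)


def parse_patch_filename(name: str) -> PatchInfo:
    """Split a patch filename into patient ID, crop coordinates, and class."""
    match = PATCH_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected patch filename: {name!r}")
    return PatchInfo(
        patient_id=match["patient"],
        x=int(match["x"]),
        y=int(match["y"]),
        label=int(match["label"]),
    )


info = parse_patch_filename("10253_idx5_x1351_y1101_class0.png")
print(info)
```

Keeping the coordinates as integers makes it straightforward to place each patch back at its original position when reassembling a whole-slide view.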

- ## Images

- The following images were generated in this project.
-
- **Normal Tissues**
-
- The image below displays a collection of 49 randomly selected **IDC negative** image patches, i.e., normal tissues. Each image is labeled with the respective "patient id" at the top.
-
- <img src="https://huggingface.co/MUmairAB/Breast_Cancer_Detector/resolve/main/Images/Random%20samples%20of%20healthy%20tissues.png" style="height: 890px; width:794px;"/>
-
- **Cancer Tissues**
-
- Similarly, the image below shows a set of 49 randomly chosen **IDC positive** image patches, which correspond to cancer tissues. Each image is labeled with the corresponding "patient id" at the top.
-
- <img src="https://huggingface.co/MUmairAB/Breast_Cancer_Detector/resolve/main/Images/Random%20samples%20of%20cancer%20tissues.png" style="height: 890px; width:794px;"/>
-
- **Complete Histopathological image of breast**
-
- Presented below is the complete **Histopathological Image**, showing the entire breast tissue. This image was formed by merging all the patches from the patient. A mask highlights the cancerous tissues, which are marked in **green**.
-
- <img src="https://huggingface.co/MUmairAB/Breast_Cancer_Detector/resolve/main/Images/Complete%20Histopathological%20image%20of%20breast.png" style="height: 515px; width: 1001px;"/>
-
-
-
-
- # Conclusion
-
- On the test data, the model achieved an accuracy of **87%**. This outcome is notable considering the small size of the dataset, the 50x50-pixel image patches, and the fact that the model was trained from scratch.
-
- Furthermore, we discovered several key insights during the analysis:
-
- In traditional Convolutional Neural Network (CNN) architectures, the number of filters typically increases progressively, following a pattern like 64, 128, and 256.
-
- However, when we followed this conventional approach with our 50x50 input images, the accuracy on the test data was only **81%**, and the validation graph showed significant fluctuations. The reason is that we lost nearly half of the information after the first CNN layer, because the image dimension was reduced to 23x23.
-
- By adopting a different filter configuration, specifically using 256 filters in each layer, we not only achieved a **6**-percentage-point increase in accuracy but also observed a more stable validation graph. This indicates that the model generated accurate predictions more consistently.
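The drop from 50x50 to 23x23 mentioned in the conclusion can be reproduced with the standard output-size formula for convolution and pooling layers. The README does not state the actual layer settings, so the 5x5 unpadded convolution followed by 2x2 max pooling used below is only one hypothetical combination that yields 23; `layer_output_size` is an illustrative helper, not part of the project.

```python
def layer_output_size(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a conv or pooling layer: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


input_size = 50

# Assumed first block: 5x5 convolution without padding, then 2x2 max pooling.
after_conv = layer_output_size(input_size, kernel=5)
after_pool = layer_output_size(after_conv, kernel=2, stride=2)

# Fraction of spatial positions that remain after the block.
retained = (after_pool ** 2) / (input_size ** 2)

print(f"after conv: {after_conv}x{after_conv}")
print(f"after pool: {after_pool}x{after_pool}")
print(f"spatial positions retained: {retained:.1%}")
```

Under these assumptions each side shrinks to a little under half its original length, so small inputs leave very little spatial resolution for later layers.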
 
  ---
+ library_name: keras
  ---

+ ## Model description

+ More information needed

+ ## Intended uses & limitations

+ More information needed

+ ## Training and evaluation data

+ More information needed

+ ## Training procedure

+ ### Training hyperparameters

+ The following hyperparameters were used during training:
 
+ | Hyperparameters | Value |
+ | :-- | :-- |
+ | name | RMSprop |
+ | weight_decay | None |
+ | clipnorm | None |
+ | global_clipnorm | None |
+ | clipvalue | None |
+ | use_ema | False |
+ | ema_momentum | 0.99 |
+ | ema_overwrite_frequency | 100 |
+ | jit_compile | False |
+ | is_legacy_optimizer | False |
+ | learning_rate | 0.0010000000474974513 |
+ | rho | 0.9 |
+ | momentum | 0.0 |
+ | epsilon | 1e-07 |
+ | centered | False |
+ | training_precision | float32 |
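The hyperparameter table can be turned back into an optimizer configuration. The sketch below copies the RMSprop values from the table and checks that the long `learning_rate` is simply 0.001 stored as float32. `RMSPROP_CONFIG`, `as_float32`, and `build_optimizer` are illustrative names, and `build_optimizer` assumes TensorFlow is installed; it is not the author's training code.

```python
import struct

# Values copied from the hyperparameter table.
RMSPROP_CONFIG = {
    "learning_rate": 0.0010000000474974513,
    "rho": 0.9,
    "momentum": 0.0,
    "epsilon": 1e-07,
    "centered": False,
}


def as_float32(value: float) -> float:
    """Round a Python float to the nearest float32 and return it as a float."""
    return struct.unpack("f", struct.pack("f", value))[0]


# The logged learning rate equals 0.001 after a round trip through float32.
learning_rate_matches = as_float32(0.001) == RMSPROP_CONFIG["learning_rate"]
print(learning_rate_matches)


def build_optimizer(config: dict = RMSPROP_CONFIG):
    """Recreate the optimizer from the table (TensorFlow is imported lazily)."""
    import tensorflow as tf

    return tf.keras.optimizers.RMSprop(**config)
```

Calling `build_optimizer()` in an environment with TensorFlow would give an optimizer that could be passed to `model.compile`; the remaining table rows (clipping, EMA, `jit_compile`) are left at their library defaults here.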

+ ## Model Plot

+ <details>
+ <summary>View Model Plot</summary>

+ ![Model Image](./model.png)

+ </details>
fingerprint.pb ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0e09376d39f7303ebe96b37e11115408151eb75094e97faa63c13f55061886ce
+ size 56
keras_metadata.pb ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1e12806250158e18da1bd213f23b046adfb349c1dcf682c1128e03b4cbe18542
+ size 52132
model.png ADDED
saved_model.pb CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:89c5f8ceef527b8be13cc78ccbe51e1c69f3bde2179526463ea8b82efbf22851
- size 628425

  version https://git-lfs.github.com/spec/v1
+ oid sha256:c9020f0ec6a819f1037a911bde66768711fd5e8995d9f13d7d62eb4de1c5d73a
+ size 584561
variables/variables.data-00000-of-00001 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:16ecf55eae368b9c5858ec93d4847c0fe08bd518479424bc68414bf1edb3145b
- size 31599518

  version https://git-lfs.github.com/spec/v1
+ oid sha256:5d7e97c60918ab8020b3647fa6e2d9446cb12d1cf26b10aa5dad90aaf698b067
+ size 15812988
variables/variables.index CHANGED
Binary files a/variables/variables.index and b/variables/variables.index differ
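
The binary files in this commit are stored as Git LFS pointers: three `key value` lines giving the spec version, the content hash, and the size in bytes. As a rough sketch, the pointer for `variables/variables.data-00000-of-00001` can be parsed and its size compared with the previous revision; `parse_lfs_pointer` is a hypothetical helper, and the old size is the one shown on the removed side of the diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# New pointer for variables/variables.data-00000-of-00001, copied from the diff.
new_pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:5d7e97c60918ab8020b3647fa6e2d9446cb12d1cf26b10aa5dad90aaf698b067
size 15812988
"""

info = parse_lfs_pointer(new_pointer)
old_size = 31_599_518  # size on the removed side of the same hunk

print(info["hash_algo"], info["size"])
print(f"new size is {info['size'] / old_size:.1%} of the old size")
```

The weights file in this revision is about half the size of the previous one, which suggests the pushed model has roughly half as many parameters, although the diff alone does not say what changed in the architecture.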