Spaces: taquynhnga (Build error)

Commit 1ce9f56
Parent(s): ac420fc
taquynhnga committed: Update Home.py

Home.py CHANGED
@@ -1,6 +1,7 @@
 import streamlit as st
 
 st.set_page_config(layout='wide')
+# st.set_page_config(layout='centered')
 
 st.title('About')
 
@@ -11,10 +12,14 @@ In the year 2020, Vision Transformers (ViT) was introduced as a Transformer mode
 Larger model and dataset sizes allow ViT to perform significantly better than ResNet; however, ViT still encounters challenges in generic computer vision tasks such as object detection and semantic segmentation.
 Swin Transformer’s success led to Transformers being adopted as a generic vision backbone, showing outstanding performance in a wide range of computer vision tasks.
 Nevertheless, the success of this approach is still primarily attributed to Transformers’ inherent superiority, rather than to the intrinsic inductive biases of convolutions.
+
 In 2022, Zhuang Liu et al. proposed a pure convolutional model dubbed ConvNeXt, derived by modernizing a standard ResNet toward the design of Vision Transformers, and claimed it outperforms them.
 
 The project aims to interpret the ConvNeXt model through several visualization techniques.
-After that, a web interface would be built to demonstrate the interpretations, helping us look inside the deep ConvNeXt model and answer the questions
+After that, a web interface would be built to demonstrate the interpretations, helping us look inside the deep ConvNeXt model and answer the questions:
+> “What patterns maximally activated this filter (channel) in this layer?”\n
+> “Which features are responsible for the current prediction?”
+
 Due to limitations in time and resources, the project used only the tiny-sized ConvNeXt model, which was trained on ImageNet-1k at resolution 224x224, and used the 50,000 images in the ImageNet-1k validation set for demo purposes.
 
 In this web app, two visualization techniques were implemented and demonstrated: **Maximally activating patches** and **SmoothGrad**.
@@ -25,9 +30,11 @@ st.write(intro_text)
 
 # 4 PAGES
 sections_text = """Overall, there are 4 functionalities in this web app:
-1)
-2)
-3)
-4)
+1) Maximally activating patches: the visualization method on this page answers the question “What patterns maximally activated this filter (channel)?”
+2) SmoothGrad: the visualization method on this page answers the question “Which features are responsible for the current prediction?”
+3) Adversarial attack: how do adversarial attacks affect ConvNeXt’s interpretation?
+4) ImageNet1k: storage of the 50,000 images in the validation set.
 """
-st.write(sections_text)
+st.write(sections_text)
+
+
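The SmoothGrad page added in this commit does not show the method itself, only its description. As a rough illustration of the idea the text references (averaging the input gradient over several noise-perturbed copies of the input), here is a minimal NumPy sketch; the model here is a toy gradient function, and the names `grad_fn`, `n_samples`, and `noise_frac` are illustrative assumptions, not identifiers from the app:

```python
import numpy as np

def smoothgrad(grad_fn, x, n_samples=25, noise_frac=0.1, seed=0):
    """Average the saliency (input gradient) over noisy copies of x.

    grad_fn    : callable returning d(score)/d(input) for an input array
    noise_frac : std of the Gaussian noise, as a fraction of x's value range
    """
    rng = np.random.default_rng(seed)
    sigma = noise_frac * (x.max() - x.min())
    total = np.zeros_like(x, dtype=float)
    for _ in range(n_samples):
        # Perturb the input and accumulate the gradient at the noisy point.
        noisy = x + rng.normal(0.0, sigma, size=x.shape)
        total += grad_fn(noisy)
    return total / n_samples

# Toy "model": score(x) = sum(x**2), whose analytic gradient is 2*x.
x = np.linspace(-1.0, 1.0, 8)
saliency = smoothgrad(lambda v: 2.0 * v, x)
```

In the real app the gradient would come from backpropagating the ConvNeXt class score to the input pixels; the averaging step shown here is what distinguishes SmoothGrad from a plain saliency map.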