Spaces: taquynhnga (Build error)

Commit 1ce9f56
Parent(s): ac420fc
taquynhnga committed: Update Home.py

Home.py CHANGED
@@ -1,6 +1,7 @@
 import streamlit as st
 
 st.set_page_config(layout='wide')
+# st.set_page_config(layout='centered')
 
 st.title('About')
 
@@ -11,10 +12,14 @@ In the year 2020, Vision Transformers (ViT) was introduced as a Transformer mode
 Larger model and dataset sizes allow ViT to perform significantly better than ResNet; however, ViT still encounters challenges in generic computer vision tasks such as object detection and semantic segmentation.
 Swin Transformer’s success led to Transformers being adopted as a generic vision backbone, showing outstanding performance in a wide range of computer vision tasks.
 Nevertheless, the success of this approach is still primarily attributed to Transformers’ inherent superiority, rather than to the intrinsic inductive biases of convolutions.
+
 In 2022, Zhuang Liu et al. proposed a pure convolutional model dubbed ConvNeXt, derived by modernizing a standard ResNet toward the design of Vision Transformers, and claimed it outperforms them.
 
 The project aims to interpret the ConvNeXt model through several visualization techniques.
-After that, a web interface would be built to demonstrate the interpretations, helping us look inside the deep ConvNeXt model and answer the questions
+After that, a web interface would be built to demonstrate the interpretations, helping us look inside the deep ConvNeXt model and answer the questions:
+> “What patterns maximally activated this filter (channel) in this layer?”\n
+> “Which features are responsible for the current prediction?”
+
 Due to limitations in time and resources, the project used only the tiny-sized ConvNeXt model, which was trained on ImageNet-1k at resolution 224x224, and used the 50,000 images in the ImageNet-1k validation set for demo purposes.
 
 In this web app, two visualization techniques were implemented and demonstrated: **Maximally activating patches** and **SmoothGrad**.
@@ -25,9 +30,11 @@ st.write(intro_text)
 
 # 4 PAGES
 sections_text = """Overall, there are 4 functionalities in this web app:
-1)
-2)
-3)
-4)
+1) Maximally activating patches: the visualization method on this page answers the question “What patterns maximally activated this filter (channel)?”
+2) SmoothGrad: the visualization method on this page answers the question “Which features are responsible for the current prediction?”
+3) Adversarial attack: how do adversarial attacks affect ConvNeXt’s interpretation?
+4) ImageNet1k: storage of the 50,000 images in the validation set.
 """
-st.write(sections_text)
+st.write(sections_text)
+
+
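The SmoothGrad page added in this commit does not show the method itself, only its description. As a rough illustration of the idea the text references (averaging the input gradient over several noise-perturbed copies of the input), here is a minimal NumPy sketch; the model here is a toy gradient function, and the names `grad_fn`, `n_samples`, and `noise_frac` are illustrative assumptions, not identifiers from the app:

```python
import numpy as np

def smoothgrad(grad_fn, x, n_samples=25, noise_frac=0.1, seed=0):
    """Average the saliency (input gradient) over noisy copies of x.

    grad_fn    : callable returning d(score)/d(input) for an input array
    noise_frac : std of the Gaussian noise, as a fraction of x's value range
    """
    rng = np.random.default_rng(seed)
    sigma = noise_frac * (x.max() - x.min())
    total = np.zeros_like(x, dtype=float)
    for _ in range(n_samples):
        # Perturb the input and accumulate the gradient at the noisy point.
        noisy = x + rng.normal(0.0, sigma, size=x.shape)
        total += grad_fn(noisy)
    return total / n_samples

# Toy "model": score(x) = sum(x**2), whose analytic gradient is 2*x.
x = np.linspace(-1.0, 1.0, 8)
saliency = smoothgrad(lambda v: 2.0 * v, x)
```

In the real app the gradient would come from backpropagating the ConvNeXt class score to the input pixels; the averaging step shown here is what distinguishes SmoothGrad from a plain saliency map.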