|
import streamlit as st |
|
from frontend.footer import add_footer |
|
|
|
st.set_page_config(layout='wide') |
|
|
|
|
|
st.title('About') |
|
|
|
|
|
intro_text = """Convolutional neural networks (ConvNets) have evolved at a rapid speed from the 2010s. |
|
Some of the representative ConvNets models are VGGNet, Inceptions, ResNe(X)t, DenseNet, MobileNet, EfficientNet and RegNet, which focus on various factors of accuracy, efficiency, and scalability. |
|
In the year 2020, Vision Transformers (ViT) was introduced as a Transformer model solving the computer vision problems. |
|
Larger model and dataset sizes allow ViT to perform significantly better than ResNet, however, ViT still encountered challenges in generic computer vision tasks such as object detection and semantic segmentation. |
|
Swin Transformer’ s success made Transformers be adopted as a generic vision backbone and showed outstanding performance in a wide range of computer vision tasks. |
|
Nevertheless, rather than the intrinsic inductive biases of convolutions, the success of this approach is still primarily attributed to Transformers’ inherent superiority. |
|
|
|
In 2022, Zhuang Liu et. al. proposed a pure convolutional model dubbed ConvNeXt, discovered from the modernization of a standard ResNet towards the design of Vision Transformers and claimed to outperform them. |
|
|
|
The project aims to interpret the ConvNeXt model by several visualization techniques. |
|
After that, a web interface would be built to demonstrate the interpretations, helping us look inside the deep ConvNeXt model and answer the questions: |
|
> “What patterns maximally activated this filter (channel) in this layer?”\n |
|
> “Which features are responsible for the current prediction?”. |
|
|
|
Due to the limitation in time and resources, the project only used the tiny-sized ConvNeXt model, which was trained on ImageNet-1k at resolution 224x224 and used 50,000 images in validation set of ImageNet-1k for demo purpose. |
|
|
|
In this web app, two visualization techniques were implemented and demonstrated, they are **Maximally activating patches** and **SmoothGrad**. |
|
Besides, this web app also helps investigate the effect of **adversarial attacks** on ConvNeXt interpretations. |
|
Last but not least, there is a last webpage that stores 50,000 images in the **ImageNet-1k** validation set, facilitating the two web pages above in searching and referencing. |
|
""" |
|
st.write(intro_text) |
|
|
|
|
|
st.subheader('Features') |
|
sections_text = """Overall, there are 4 features in this web app: |
|
1) Maximally activating patches: The visualization method in this page answers the question “what patterns maximally activated this filter (channel)?”. |
|
2) SmoothGrad: This visualization method in this page answers the question “which features are responsible for the current prediction?”. |
|
3) Adversarial attack: How adversarial attacks affect ConvNeXt interpretation? |
|
4) ImageNet1k: The storage of 50,000 images in validation set. |
|
""" |
|
st.write(sections_text) |
|
|
|
|
|
add_footer('Developed with ❤ by ', 'Hanna Ta Quynh Nga', 'https://www.linkedin.com/in/ta-quynh-nga-hanna/') |
|
|