import streamlit as st
st.set_page_config(layout='wide')
# st.set_page_config(layout='centered')
st.title('About')
# INTRO
intro_text = """Convolutional neural networks (ConvNets) have evolved rapidly since the 2010s.
Representative ConvNet models include VGGNet, Inception, ResNe(X)t, DenseNet, MobileNet, EfficientNet, and RegNet, which focus on different aspects of accuracy, efficiency, and scalability.
In 2020, the Vision Transformer (ViT) was introduced as a Transformer model for computer vision problems.
Larger model and dataset sizes allow ViT to perform significantly better than ResNet. However, ViT still encountered challenges in generic computer vision tasks such as object detection and semantic segmentation.
The success of Swin Transformer led to Transformers being adopted as a generic vision backbone, and it showed outstanding performance across a wide range of computer vision tasks.
Nevertheless, the success of this approach is still primarily attributed to the inherent superiority of Transformers rather than to the intrinsic inductive biases of convolutions.
In 2022, Zhuang Liu et al. proposed a pure convolutional model dubbed ConvNeXt, obtained by modernizing a standard ResNet toward the design of Vision Transformers, and claimed that it outperforms them.
This project aims to interpret the ConvNeXt model using several visualization techniques.
A web interface was then built to demonstrate the interpretations, helping us look inside the deep ConvNeXt model and answer the questions:
> “What patterns maximally activated this filter (channel) in this layer?”\n
> “Which features are responsible for the current prediction?”
Due to limited time and resources, the project uses only the tiny ConvNeXt model, which was trained on ImageNet-1k at a resolution of 224x224, and uses the 50,000 images in the ImageNet-1k validation set for demonstration purposes.
In this web app, two visualization techniques are implemented and demonstrated: **Maximally activating patches** and **SmoothGrad**.
In addition, this web app helps investigate the effect of **adversarial attacks** on ConvNeXt interpretations.
Finally, a last page stores the 50,000 images of the **ImageNet-1k** validation set, which makes searching and referencing easier for the two web pages above.
"""
st.write(intro_text)
# 4 PAGES
sections_text = """Overall, this web app offers 4 functionalities:
1) Maximally activating patches: The visualization method on this page answers the question “What patterns maximally activated this filter (channel)?”.
2) SmoothGrad: The visualization method on this page answers the question “Which features are responsible for the current prediction?”.
3) Adversarial attack: How do adversarial attacks affect ConvNeXt interpretations?
4) ImageNet1k: A store of the 50,000 images in the ImageNet-1k validation set.
"""
st.write(sections_text)