import streamlit as st
st.set_page_config(layout='wide')
# st.set_page_config(layout='centered')
st.title('About')
# INTRO
intro_text = """Convolutional neural networks (ConvNets) have evolved rapidly since the 2010s.
Representative ConvNet models include VGGNet, Inceptions, ResNe(X)t, DenseNet, MobileNet, EfficientNet and RegNet, each focusing on a different aspect of accuracy, efficiency, and scalability.
In 2020, the Vision Transformer (ViT) was introduced as a Transformer model for computer vision problems.
Larger model and dataset sizes allow ViT to perform significantly better than ResNet; however, ViT still faced challenges in generic computer vision tasks such as object detection and semantic segmentation.
The success of Swin Transformer led to Transformers being adopted as a generic vision backbone, with outstanding performance across a wide range of computer vision tasks.
Nevertheless, this success is still primarily attributed to the inherent superiority of Transformers rather than to the intrinsic inductive biases of convolutions.
In 2022, Zhuang Liu et al. proposed a pure convolutional model dubbed ConvNeXt, obtained by modernizing a standard ResNet towards the design of Vision Transformers, and claimed that it outperforms them.
This project aims to interpret the ConvNeXt model using several visualization techniques.
A web interface was then built to demonstrate the interpretations, helping us look inside the deep ConvNeXt model and answer the following questions:
> “What patterns maximally activated this filter (channel) in this layer?”\n
> “Which features are responsible for the current prediction?”
Due to limited time and resources, the project uses only the tiny-sized ConvNeXt model, which was trained on ImageNet-1k at a resolution of 224x224, and uses the 50,000 images in the ImageNet-1k validation set for demo purposes.
This web app implements and demonstrates two visualization techniques: **Maximally activating patches** and **SmoothGrad**.
In addition, the web app helps investigate the effect of **adversarial attacks** on ConvNeXt interpretations.
Finally, a last page stores the 50,000 images of the **ImageNet-1k** validation set, making it easier to search for and reference images used on the pages above.
"""
st.write(intro_text)
# 4 PAGES
sections_text = """Overall, this web app offers 4 functionalities:
1) Maximally activating patches: The visualization method on this page answers the question “What patterns maximally activated this filter (channel)?”.
2) SmoothGrad: The visualization method on this page answers the question “Which features are responsible for the current prediction?”.
3) Adversarial attack: How do adversarial attacks affect ConvNeXt interpretations?
4) ImageNet1k: A store of the 50,000 images in the ImageNet-1k validation set.
"""
st.write(sections_text)