## ImageNet Results
In our ImageNet experiment, we assessed the performance of Mice ViTs on a larger and more diverse dataset. We trained the Mice ViTs to classify the 1,000 ImageNet classes.
## Training Details
As in the dSprites experiment, we explored two model variants for each attention-layer setting: an attention-only model and a model combining attention with the MLP module. Dropout and layer normalization were not applied, for simplicity. The detailed training logs and metrics can be found [here](https://wandb.ai/vit-prisma/Imagenet/overview?workspace=user-yash-vadi).
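
To make the two variants concrete, here is a minimal PyTorch sketch of the corresponding transformer blocks. This is an illustration only, not the exact released implementation; the class names and dimensions are hypothetical, and it reflects the setup above (no dropout, no layer normalization):

```python
import torch
import torch.nn as nn


class AttentionOnlyBlock(nn.Module):
    """A transformer block with only self-attention and a residual
    connection; no LayerNorm or dropout, per the training setup above."""

    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn_out, _ = self.attn(x, x, x)
        return x + attn_out


class AttentionMLPBlock(nn.Module):
    """The same block with an MLP sublayer added after attention."""

    def __init__(self, dim: int, num_heads: int, mlp_ratio: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio),
            nn.GELU(),
            nn.Linear(dim * mlp_ratio, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn_out, _ = self.attn(x, x, x)
        x = x + attn_out
        return x + self.mlp(x)
```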
## Table of Results
The table below reports the accuracy of Mice ViTs in each configuration as `[ <top-1 accuracy> | <top-5 accuracy> ]`; a sketch of how these metrics are computed follows the table.
| **Size** | **Num. Layers** | **Attention+MLP** | **Attention-Only** | **Model Links** |
|:--------:|:-------------:|:-----------------:|:-----------------:|--------------------------------------------|
| **tiny** | **1** | 0.16 \| 0.33 | 0.11 \| 0.25 | [AttentionOnly](https://huggingface.co/IamYash/ImageNet-tiny-AttentionOnly), [Attention+MLP](https://huggingface.co/IamYash/ImageNet-tiny-Attention-and-MLP) |
| **base** | **2** | 0.23 \| 0.44 | 0.16 \| 0.34 | [AttentionOnly](https://huggingface.co/IamYash/ImageNet-base-AttentionOnly), [Attention+MLP](https://huggingface.co/IamYash/ImageNet-base-Attention-and-MLP) |
| **small**| **3** | 0.28 \| 0.51 | 0.17 \| 0.35 | [AttentionOnly](https://huggingface.co/IamYash/ImageNet-small-AttentionOnly), [Attention+MLP](https://huggingface.co/IamYash/ImageNet-small-Attention-and-MLP) |
| **medium**|**4** | 0.33 \| 0.56 | 0.17 \| 0.36 | [AttentionOnly](https://huggingface.co/IamYash/ImageNet-medium-AttentionOnly), [Attention+MLP](https://huggingface.co/IamYash/ImageNet-medium-Attention-and-MLP) |
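
For reference, the two numbers in each cell follow the standard top-k accuracy definition. A minimal sketch of that computation, assuming `logits` of shape `[batch, 1000]` and integer class `labels` (the shapes and data here are illustrative, not from our evaluation pipeline):

```python
import torch


def topk_accuracy(logits: torch.Tensor, labels: torch.Tensor, k: int = 1) -> float:
    """Fraction of samples whose true label is among the k highest logits."""
    # Indices of the k largest logits per sample: shape [batch, k]
    topk = logits.topk(k, dim=-1).indices
    # A sample counts as correct if any of its top-k predictions matches.
    correct = (topk == labels.unsqueeze(-1)).any(dim=-1)
    return correct.float().mean().item()


# Example with random data (hypothetical: 8 samples, 1000 ImageNet classes)
logits = torch.randn(8, 1000)
labels = torch.randint(0, 1000, (8,))
print(topk_accuracy(logits, labels, k=1), topk_accuracy(logits, labels, k=5))
```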