Update README.md
README.md
@@ -5,7 +5,7 @@ This is a PyTorch implementation of **Mugs** proposed by our paper "**Mugs: A Mu
 [![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/mugs-a-multi-granular-self-supervised/self-supervised-image-classification-on)](https://paperswithcode.com/sota/self-supervised-image-classification-on?p=mugs-a-multi-granular-self-supervised)
 
 <div align="center">
-<img width="
+<img width="75%" alt="Overall framework of Mugs. " src="https://huggingface.co/zhoupans/Mugs_ViT_large_pretrained/resolve/main/exp_illustration/framework.png">
 </div>
 
 **<p align="center">Fig 1. Overall framework of Mugs.** In (a), for each image, two random crops of one image
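Fig 1(a) begins with two random crops of each image. For reference, a minimal sketch of that two-crop idea with torchvision; the 224 px size and (0.25, 1.0) scale are illustrative placeholders, not Mugs' exact augmentation recipe:

```python
# Sketch of "two random crops of one image" (Fig 1a) using torchvision.
# Crop size and scale range are illustrative, not Mugs' actual settings.
from torchvision import transforms

class TwoCrops:
    """Apply the same random augmentation pipeline twice to one image."""
    def __init__(self, size=224):
        self.aug = transforms.Compose([
            transforms.RandomResizedCrop(size, scale=(0.25, 1.0)),
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
        ])

    def __call__(self, img):
        # Two independent draws of the random augmentation -> two views.
        return self.aug(img), self.aug(img)
```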
@@ -93,9 +93,10 @@ You can choose to download only the weights of the pretrained backbone used for
 </table>
 
 <div align="center">
-<img width="
+<img width="75%" alt="Comparison of linear probing accuracy on ImageNet-1K." src="https://huggingface.co/zhoupans/Mugs_ViT_large_pretrained/resolve/main/exp_illustration/comparison.png">
 </div>
 
+
 **<p align="center">Fig 2. Comparison of linear probing accuracy on ImageNet-1K.**</p>
 
 ## Pretraining Settings
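Fig 2 reports linear probing accuracy, i.e., the backbone stays frozen and only a linear classifier is trained on its features. A minimal sketch of that protocol; `backbone`, `feat_dim`, and `loader` are placeholder names, not this repository's actual API:

```python
# Sketch of linear probing: freeze the pretrained backbone, train only a
# linear head on its features. All names here are placeholders.
import torch
import torch.nn as nn

def linear_probe(backbone, feat_dim, loader, num_classes=1000, epochs=100):
    backbone.eval()                          # frozen feature extractor
    for p in backbone.parameters():
        p.requires_grad = False

    head = nn.Linear(feat_dim, num_classes)
    opt = torch.optim.SGD(head.parameters(), lr=0.01, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()

    for _ in range(epochs):
        for images, labels in loader:
            with torch.no_grad():
                feats = backbone(images)     # e.g. the [CLS] embedding
            loss = loss_fn(head(feats), labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return head
```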
@@ -149,9 +150,10 @@ We are cleaning up the evalutation code and will release them when they are read
 ## Self-attention visualization
 Here we provide the self-attention map of the [CLS] token on the heads of the last layer
 <div align="center">
-<img width="
+<img width="75%" alt="Self-attention from a ViT-Base/16 trained with Mugs" src="https://huggingface.co/zhoupans/Mugs_ViT_large_pretrained/resolve/main/exp_illustration/attention_vis.png">
 </div>
 
+
 **<p align="center">Fig 3. Self-attention from a ViT-Base/16 trained with Mugs.**</p>
 
 
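A Fig 3-style map reads out, for each head of the last layer, how much the [CLS] query attends to every patch token. A sketch assuming a DINO-style `get_last_selfattention` helper on the ViT (hypothetical here if the released model does not expose one):

```python
# Sketch of per-head [CLS] attention maps from the last layer.
# Assumes a DINO-style `get_last_selfattention(x)` helper; patch size 16
# as in ViT-B/16. This is an assumption, not this repo's confirmed API.
import torch

@torch.no_grad()
def cls_attention_maps(model, image, patch_size=16):
    # image: (1, 3, H, W) with H and W divisible by patch_size
    attn = model.get_last_selfattention(image)  # (1, heads, tokens, tokens)
    n_heads = attn.shape[1]
    # Attention of the [CLS] query (index 0) to all patch keys (index 1:).
    cls_attn = attn[0, :, 0, 1:]                # (heads, num_patches)
    h = image.shape[-2] // patch_size
    w = image.shape[-1] // patch_size
    return cls_attn.reshape(n_heads, h, w)      # one h x w map per head
```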
@@ -160,10 +162,14 @@ Here we provide the T-SNE visualization of the learned feature by ViT-B/16.
 We show the fish classes in ImageNet-1K, i.e., the first six classes,
 including tench, goldfish, white shark, tiger shark, hammerhead, electric
 ray. See more examples in Appendix.
-
-
+
+
+<div align="center">
+<img width="90%" alt="T-SNE visualization of the learned feature by ViT-B/16." src="https://huggingface.co/zhoupans/Mugs_ViT_large_pretrained/resolve/main/exp_illustration/TSNE.png">
 </div>
 
+
+
 **<p align="center">Fig 4. T-SNE visualization of the learned feature by ViT-B/16.**</p>
 
 ## License
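A Fig 4-style plot can be reproduced by running t-SNE on frozen backbone features. A sketch with scikit-learn, assuming `feats` (N x D) and `labels` (N,) are NumPy arrays precomputed with the pretrained ViT-B/16; both are placeholders, not outputs of this repo:

```python
# Sketch of a Fig 4-style t-SNE plot of frozen ViT features for the first
# six ImageNet-1K classes (the fish classes). `feats` and `labels` are
# assumed precomputed NumPy arrays, not produced by this repository.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_tsne(feats: np.ndarray, labels: np.ndarray) -> None:
    xy = TSNE(n_components=2, init="pca", perplexity=30).fit_transform(feats)
    for c in np.unique(labels):
        m = labels == c
        plt.scatter(xy[m, 0], xy[m, 1], s=4, label=str(c))
    plt.legend()
    plt.show()
```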