Bill Psomas committed on
Commit 13e2dbd
1 Parent(s): 6a2c6d4

first model commit

Files changed (2):
  1. README.md +12 -2
  2. configs.yaml +45 -0
README.md CHANGED
@@ -15,9 +15,10 @@ tags:
  - deep learning
  ---
 
- # Self-supervised ViT-S/16 (small-sized Vision Transformer with patch size 16) model with SimPool.
- 
- ViT-S model with SimPool (no gamma) trained on ImageNet-1k for 100 epochs. Self-supervision with DINO.
+ # Self-supervised ViT-S/16 (small-sized Vision Transformer with patch size 16) model with SimPool
+ 
+ ViT-S model with SimPool (no gamma) trained on ImageNet-1k for 100 epochs. Self-supervision with [DINO](https://arxiv.org/abs/2104.14294).
+ 
  SimPool is a simple attention-based pooling method at the end of the network, introduced in this ICCV 2023 [paper](https://arxiv.org/pdf/2309.06891.pdf) and released in this [repository](https://github.com/billpsomas/simpool/).
  Disclaimer: This model card is written by the author of SimPool, i.e. [Bill Psomas](http://users.ntua.gr/psomasbill/).
 
@@ -32,6 +33,15 @@ SimPool is a simple attention-based pooling mechanism as a replacement of the de
  Interestingly, we find that, whether supervised or self-supervised, SimPool improves performance on pre-training and downstream tasks and provides attention maps delineating object boundaries in all cases.
  One could thus call SimPool universal.
 
+ ## Evaluation with k-NN
+ 
+ | k   | top-1 (%) | top-5 (%) |
+ | --- | --------- | --------- |
+ | 10  | 69.778    | 85.91     |
+ | 20  | 69.602    | 87.54     |
+ | 100 | 67.318    | 88.674    |
+ | 200 | 65.966    | 88.404    |
+ 
  ## BibTeX entry and citation info
 
  ```
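For intuition about what the `mode: simpool` setting in the config below selects, here is a minimal PyTorch sketch of the pooling idea the paper describes: a query initialized by global average pooling attends over the patch tokens, and the pooled representation is the attention-weighted sum of those tokens. This is an illustration only, not the official implementation from the repository linked above; the class name `SimPoolSketch` and the use of unprojected tokens as values are assumptions here, and the optional gamma exponent (disabled in this "no gamma" model) is omitted.

```python
import torch
import torch.nn as nn

class SimPoolSketch(nn.Module):
    """Illustrative attention-based pooling: GAP-initialized query over patch tokens."""

    def __init__(self, dim: int):
        super().__init__()
        self.wq = nn.Linear(dim, dim, bias=False)  # query projection
        self.wk = nn.Linear(dim, dim, bias=False)  # key projection
        self.scale = dim ** -0.5                   # scaled dot-product attention

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_patches, dim) patch tokens from the ViT encoder
        q = self.wq(x.mean(dim=1, keepdim=True))   # GAP -> query, (B, 1, D)
        k = self.wk(x)                             # keys, (B, N, D)
        attn = (q @ k.transpose(-2, -1)) * self.scale
        attn = attn.softmax(dim=-1)                # (B, 1, N) attention over patches
        return (attn @ x).squeeze(1)               # (B, D) pooled representation

# Shapes matching ViT-S/16 at 224px: 14x14 = 196 patches, embedding dim 384.
pool = SimPoolSketch(dim=384)
tokens = torch.randn(2, 196, 384)
print(pool(tokens).shape)                          # torch.Size([2, 384])
```

The attention weights over patches are what yield the object-delineating attention maps mentioned in the README.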
configs.yaml ADDED
@@ -0,0 +1,45 @@
+ arch: vit_small
+ backend: nccl
+ batch_size_per_gpu: 100
+ clip_grad: 0.0
+ data_path: /path/to/imagenet/
+ dist_url: env://
+ drop_path_rate: 0.1
+ epochs: 100
+ eval_every: 30
+ freeze_last_layer: 1
+ global_crops_scale:
+ - 0.25
+ - 1.0
+ local_crops_number: 6
+ local_crops_scale:
+ - 0.05
+ - 0.25
+ local_rank: 0
+ lr: 0.0005
+ min_lr: 1.0e-05
+ mode: simpool
+ momentum_teacher: 0.996
+ nb_knn:
+ - 10
+ - 20
+ - 100
+ - 200
+ norm_last_layer: false
+ num_workers: 10
+ optimizer: adamw
+ out_dim: 65536
+ output_dir: /path/to/output/
+ patch_size: 16
+ saveckp_freq: 20
+ seed: 0
+ subset: -1
+ teacher_temp: 0.07
+ temperature: 0.07
+ use_bn_in_head: false
+ use_fp16: false
+ warmup_epochs: 10
+ warmup_teacher_temp: 0.04
+ warmup_teacher_temp_epochs: 30
+ weight_decay: 0.04
+ weight_decay_end: 0.4
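The `nb_knn` and `temperature` entries drive the DINO-style weighted k-NN evaluation reported in the README table above: each test feature votes among its k nearest training features by cosine similarity, with weights exp(sim / T). A minimal sketch of that protocol, assuming features have already been extracted and L2-normalized; the tensors below are random placeholders, not real ImageNet features:

```python
import torch

def knn_predict(train_feats, train_labels, test_feats, k=20, T=0.07, num_classes=1000):
    """Weighted k-NN classification over frozen, L2-normalized features."""
    sim = test_feats @ train_feats.t()           # cosine similarity, (num_test, num_train)
    topk_sim, topk_idx = sim.topk(k, dim=1)      # k nearest neighbors per test sample
    topk_labels = train_labels[topk_idx]         # neighbor labels, (num_test, k)
    weights = (topk_sim / T).exp()               # temperature-scaled similarity weights
    votes = torch.zeros(test_feats.size(0), num_classes)
    votes.scatter_add_(1, topk_labels, weights)  # accumulate weighted votes per class
    return votes.argmax(dim=1)

# Hypothetical usage with placeholder data (dim 384 for ViT-S features):
train_feats = torch.nn.functional.normalize(torch.randn(5000, 384), dim=1)
test_feats = torch.nn.functional.normalize(torch.randn(100, 384), dim=1)
train_labels = torch.randint(0, 1000, (5000,))
preds = knn_predict(train_feats, train_labels, test_feats, k=20)
```

Running this over the four `nb_knn` values (10, 20, 100, 200) with `temperature: 0.07` corresponds to the four rows of the k-NN table in the README.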