Bill Psomas committed on
Commit
a279524
1 Parent(s): fcba0d3

update readme

Files changed (4)
  1. README.md +64 -0
  2. checkpoint.pth +3 -0
  3. configs.yaml +45 -0
  4. log.txt +100 -0
README.md CHANGED
@@ -1,3 +1,67 @@
  ---
  license: cc-by-4.0
+ datasets:
+ - imagenet-1k
+ metrics:
+ - accuracy
+ pipeline_tag: image-classification
+ language:
+ - en
+ tags:
+ - convnext
+ - convolutional neural network
+ - simpool
+ - dino
+ - computer vision
+ - deep learning
  ---
+
+ # Self-supervised ConvNeXt-S model with SimPool
+
+ ConvNeXt-S model with SimPool (gamma=2.0) trained on ImageNet-1k for 100 epochs. Self-supervision with [DINO](https://arxiv.org/abs/2104.14294).
+
+ SimPool is a simple attention-based pooling method applied at the end of the network, introduced in the ICCV 2023 [paper](https://arxiv.org/pdf/2309.06891.pdf) and released in this [repository](https://github.com/billpsomas/simpool/).
+ Disclaimer: This model card is written by the author of SimPool, i.e. [Bill Psomas](http://users.ntua.gr/psomasbill/).
+
+ ## Motivation
+
+ Convolutional networks and vision transformers have different forms of pairwise interactions, pooling across layers and pooling at the end of the network. Does the latter really need to be different?
+ As a by-product of pooling, vision transformers provide spatial attention for free, but this is most often of low quality unless self-supervised, which is not well studied. Is supervision really the problem?
+
+ ## Method
+
+ SimPool is a simple attention-based pooling mechanism used as a replacement for the default one in both convolutional and transformer encoders. For transformers, we completely discard the [CLS] token.
+ Interestingly, we find that, whether supervised or self-supervised, SimPool improves performance on pre-training and downstream tasks and provides attention maps delineating object boundaries in all cases.
+ One could thus call SimPool universal.
+
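To make the pooling step concrete, here is a simplified numpy sketch of SimPool-style attention pooling. This is not the released implementation: the learned query/key projections are omitted, features are assumed non-negative, and `simpool_sketch` is a hypothetical helper name. It only illustrates the idea of a global-average-pooled query attending over spatial locations, with the exponent gamma acting through a generalized (power) mean.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()

def simpool_sketch(X, gamma=2.0):
    """Pool features X of shape (num_locations, dim) into one vector.

    A GAP query attends over locations; the attention then weights an
    element-wise power mean of the features with exponent gamma.
    (Simplified sketch: no learned projections, non-negative features.)
    """
    q = X.mean(axis=0)                          # GAP query, shape (dim,)
    a = softmax(X @ q / np.sqrt(X.shape[1]))    # attention over locations
    return (a[:, None] * np.power(X, gamma)).sum(axis=0) ** (1.0 / gamma)
```

With gamma=1.0 this reduces to a plain attention-weighted average; the checkpoint here was trained with gamma=2.0.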
+ ## Evaluation with k-NN
+
+ | k | top-1 (%) | top-5 (%) |
+ | ------- | ------- | ------- |
+ | 10 | 68.66 | 85.832 |
+ | 20 | 68.508 | 87.636 |
+ | 100 | 66.33 | 88.876 |
+ | 200 | 64.93 | 88.53 |
+
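These numbers come from a DINO-style weighted k-NN evaluation: each test feature votes among its k nearest training features using cosine similarity and temperature-scaled weights. A minimal numpy sketch of that kind of evaluation (simplified from the released torch code; `knn_top1` and its defaults are illustrative):

```python
import numpy as np

def knn_top1(train_feats, train_labels, test_feats, test_labels, k=20, T=0.07):
    """Weighted k-NN top-1 accuracy with cosine similarity."""
    # L2-normalize so the dot product is cosine similarity
    train = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    test = test_feats / np.linalg.norm(test_feats, axis=1, keepdims=True)
    sims = test @ train.T                      # (n_test, n_train)
    nn = np.argsort(-sims, axis=1)[:, :k]      # indices of k nearest neighbors
    n_classes = int(train_labels.max()) + 1
    correct = 0
    for i in range(len(test)):
        weights = np.exp(sims[i, nn[i]] / T)   # temperature-scaled vote weights
        votes = np.zeros(n_classes)
        for w, j in zip(weights, nn[i]):
            votes[train_labels[j]] += w
        correct += int(votes.argmax() == test_labels[i])
    return correct / len(test)
```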
+ ## BibTeX entry and citation info
+
+ ```
+ @misc{psomas2023simpool,
+ title={Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit?},
+ author={Bill Psomas and Ioannis Kakogeorgiou and Konstantinos Karantzalos and Yannis Avrithis},
+ year={2023},
+ eprint={2309.06891},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV}
+ }
+ ```
+
+ ```
+ @inproceedings{liu2022convnet,
+ title={A ConvNet for the 2020s},
+ author={Liu, Zhuang and Mao, Hanzi and Wu, Chao-Yuan and Feichtenhofer, Christoph and Darrell, Trevor and Xie, Saining},
+ booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
+ pages={11976--11986},
+ year={2022}
+ }
+ ```
checkpoint.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8219ad4632b4b10ea448aa8d10d66dca42e09791713049b0bdf7cf5326665914
+ size 1180632750
configs.yaml ADDED
@@ -0,0 +1,45 @@
+ arch: convnext_small
+ backend: nccl
+ batch_size_per_gpu: 60
+ clip_grad: 0.3
+ data_path: /path/to/imagenet/
+ dist_url: env://
+ drop_path_rate: 0.1
+ epochs: 100
+ eval_every: 30
+ freeze_last_layer: 3
+ global_crops_scale: !!python/tuple
+ - 0.14
+ - 1.0
+ local_crops_number: 6
+ local_crops_scale: !!python/tuple
+ - 0.05
+ - 0.14
+ local_rank: 0
+ lr: 0.001
+ min_lr: 2.0e-06
+ mode: simpool
+ momentum_teacher: 0.996
+ nb_knn:
+ - 10
+ - 20
+ - 100
+ - 200
+ norm_last_layer: true
+ num_workers: 10
+ optimizer: adamw
+ out_dim: 65536
+ output_dir: /path/to/output/
+ patch_size: 16
+ saveckp_freq: 20
+ seed: 0
+ subset: -1
+ teacher_temp: 0.07
+ temperature: 0.07
+ use_bn_in_head: false
+ use_fp16: false
+ warmup_epochs: 10
+ warmup_teacher_temp: 0.04
+ warmup_teacher_temp_epochs: 50
+ weight_decay: 0.04
+ weight_decay_end: 0.4
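One caveat when reading configs.yaml back in Python: the crop scales are stored with the `!!python/tuple` tag, which `yaml.safe_load` rejects. A sketch of one safe way to load it (assumes PyYAML is installed; `ConfigLoader` is an illustrative name, shown on an inline excerpt of the file):

```python
import yaml

class ConfigLoader(yaml.SafeLoader):
    """SafeLoader that additionally understands the !!python/tuple tag."""

# Map the python/tuple tag to a plain tuple instead of refusing it
ConfigLoader.add_constructor(
    "tag:yaml.org,2002:python/tuple",
    lambda loader, node: tuple(loader.construct_sequence(node)),
)

excerpt = """\
arch: convnext_small
global_crops_scale: !!python/tuple
- 0.14
- 1.0
"""
cfg = yaml.load(excerpt, Loader=ConfigLoader)
```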
log.txt ADDED
@@ -0,0 +1,100 @@
+ {"train_loss": 8.370312581351742, "train_entropy": 6.235790162597447, "train_KL_div": 2.1345224282929363, "train_lr": 9.371838585184911e-05, "train_wd": 0.040029590715150644, "epoch": 0, "k-NN": {"10": {"top1": 1.738, "top5": 4.714}, "20": {"top1": 2.106, "top5": 5.168}, "100": {"top1": 2.538, "top5": 7.108}, "200": {"top1": 2.646, "top5": 7.686}}}
+ {"train_loss": 6.324378517983602, "train_entropy": 3.1152541266817777, "train_KL_div": 3.2091244178436455, "train_lr": 0.00028122541121810494, "train_wd": 0.04020716650296683, "epoch": 1}
+ {"train_loss": 5.405653928238696, "train_entropy": 1.9404335047487196, "train_KL_div": 3.4652204253937966, "train_lr": 0.0004687324365843606, "train_wd": 0.040562176110793, "epoch": 2}
+ {"train_loss": 9.099132882382996, "train_entropy": 8.028683604331286, "train_KL_div": 1.0704492867347475, "train_lr": 0.0006562394619506165, "train_wd": 0.04109426918700657, "epoch": 3}
+ {"train_loss": 7.7332632743762915, "train_entropy": 5.794214907658149, "train_KL_div": 1.9390483487072219, "train_lr": 0.0008437464873168719, "train_wd": 0.041802920619982066, "epoch": 4}
+ {"train_loss": 6.667616084904741, "train_entropy": 4.207212774516402, "train_KL_div": 2.4604033577549362, "train_lr": 0.001031253512683128, "train_wd": 0.04268743105631148, "epoch": 5}
+ {"train_loss": 6.116927636478134, "train_entropy": 3.551711852115475, "train_KL_div": 2.565215792379919, "train_lr": 0.001218760538049384, "train_wd": 0.043746927590982455, "epoch": 6}
+ {"train_loss": 5.76854561669588, "train_entropy": 3.205333225588175, "train_KL_div": 2.563212368909497, "train_lr": 0.00140626756341564, "train_wd": 0.04498036462882962, "epoch": 7}
+ {"train_loss": 5.52121238886707, "train_entropy": 2.982181281577161, "train_KL_div": 2.5390310946052197, "train_lr": 0.0015937745887818948, "train_wd": 0.04638652491641175, "epoch": 8}
+ {"train_loss": 5.340750132061739, "train_entropy": 2.830700676230823, "train_KL_div": 2.5100494581422987, "train_lr": 0.0017812816141481512, "train_wd": 0.047964020743292514, "epoch": 9}
+ {"train_loss": 5.194406159945131, "train_entropy": 2.7211264220459626, "train_KL_div": 2.4732797546259873, "train_lr": 0.0018748099356372839, "train_wd": 0.04971129531154394, "epoch": 10, "k-NN": {"10": {"top1": 39.752, "top5": 60.97}, "20": {"top1": 39.856, "top5": 63.62}, "100": {"top1": 37.884, "top5": 65.472}, "200": {"top1": 36.044, "top5": 64.634}}}
+ {"train_loss": 5.068999081637896, "train_entropy": 2.6370144456613818, "train_KL_div": 2.4319846436476253, "train_lr": 0.0018736693999551052, "train_wd": 0.05162662427211857, "epoch": 11}
+ {"train_loss": 4.979237863726275, "train_entropy": 2.5785220736151286, "train_KL_div": 2.400715798251233, "train_lr": 0.0018713895044108898, "train_wd": 0.05370811742657265, "epoch": 12}
+ {"train_loss": 4.911476687216589, "train_entropy": 2.541079262092168, "train_KL_div": 2.370397431801749, "train_lr": 0.0018679730267061533, "train_wd": 0.05595372059246422, "epoch": 13}
+ {"train_loss": 4.861703909417135, "train_entropy": 2.5179538136789885, "train_KL_div": 2.34375010151102, "train_lr": 0.0018634241292927246, "train_wd": 0.05836121763058376, "epoch": 14}
+ {"train_loss": 4.823163686664241, "train_entropy": 2.5039728351325388, "train_KL_div": 2.3191908542283155, "train_lr": 0.001857748354301394, "train_wd": 0.060928232632015356, "epoch": 15}
+ {"train_loss": 4.795157956313265, "train_entropy": 2.4980248783960373, "train_KL_div": 2.2971330875759044, "train_lr": 0.0018509526167897036, "train_wd": 0.06365223226287196, "epoch": 16}
+ {"train_loss": 4.773078888286228, "train_entropy": 2.494850558984945, "train_KL_div": 2.2782283280841775, "train_lr": 0.0018430451963169943, "train_wd": 0.06653052826439025, "epoch": 17}
+ {"train_loss": 4.758616074299982, "train_entropy": 2.496778675738503, "train_KL_div": 2.2618374026706043, "train_lr": 0.0018340357268570872, "train_wd": 0.0695602801059179, "epoch": 18}
+ {"train_loss": 4.751634145517749, "train_entropy": 2.5076569636558586, "train_KL_div": 2.2439771790424508, "train_lr": 0.0018239351850607122, "train_wd": 0.07273849778817326, "epoch": 19}
+ {"train_loss": 4.745623217177507, "train_entropy": 2.519962328265963, "train_KL_div": 2.2256608919654726, "train_lr": 0.001812755876882169, "train_wd": 0.07606204479401499, "epoch": 20, "k-NN": {"10": {"top1": 52.864, "top5": 74.14}, "20": {"top1": 52.746, "top5": 76.546}, "100": {"top1": 50.122, "top5": 77.766}, "200": {"top1": 48.15, "top5": 76.844}}}
+ {"train_loss": 4.7447938890258845, "train_entropy": 2.534767439667765, "train_KL_div": 2.210026455438061, "train_lr": 0.0018005114225864384, "train_wd": 0.07952764118380551, "epoch": 21}
+ {"train_loss": 4.749147252370553, "train_entropy": 2.5554913526932013, "train_KL_div": 2.193655901927321, "train_lr": 0.0017872167401549706, "train_wd": 0.08313186683231381, "epoch": 22}
+ {"train_loss": 4.75773145467791, "train_entropy": 2.579687219639314, "train_KL_div": 2.1780442423747246, "train_lr": 0.0017728880271104118, "train_wd": 0.08687116480396875, "epoch": 23}
+ {"train_loss": 4.765693075058834, "train_entropy": 2.603541495805378, "train_KL_div": 2.1621515909387807, "train_lr": 0.001757542740782444, "train_wd": 0.09074184486312321, "epoch": 24}
+ {"train_loss": 4.7780585554308015, "train_entropy": 2.629245277177354, "train_KL_div": 2.14881329418189, "train_lr": 0.001741199577038695, "train_wd": 0.09474008711587155, "epoch": 25}
+ {"train_loss": 4.792970566166256, "train_entropy": 2.657359203447247, "train_KL_div": 2.135611383437708, "train_lr": 0.0017238784475067158, "train_wd": 0.09886194577982399, "epoch": 26}
+ {"train_loss": 4.810332554898132, "train_entropy": 2.6904685357372697, "train_KL_div": 2.119864041336738, "train_lr": 0.0017056004553147263, "train_wd": 0.10310335307811701, "epoch": 27}
+ {"train_loss": 4.8274960406445615, "train_entropy": 2.72071297419299, "train_KL_div": 2.106783098810931, "train_lr": 0.0016863878693807006, "train_wd": 0.1074601232538191, "epoch": 28}
+ {"train_loss": 4.847888565947832, "train_entropy": 2.7559683465497598, "train_KL_div": 2.0919202500099368, "train_lr": 0.0016662640972811295, "train_wd": 0.11192795670076539, "epoch": 29}
+ {"train_loss": 4.866915328666127, "train_entropy": 2.7882609334765593, "train_KL_div": 2.07865442801232, "train_lr": 0.0016452536567324889, "train_wd": 0.11650244420675228, "epoch": 30, "k-NN": {"10": {"top1": 57.212, "top5": 77.67}, "20": {"top1": 57.06, "top5": 79.912}, "100": {"top1": 54.526, "top5": 81.152}, "200": {"top1": 52.65, "top5": 80.444}}}
+ {"train_loss": 4.889047224899498, "train_entropy": 2.8270666603851784, "train_KL_div": 2.061980589420307, "train_lr": 0.0016233821457201716, "train_wd": 0.12117907130489577, "epoch": 31}
+ {"train_loss": 4.91206580299972, "train_entropy": 2.864892378689511, "train_KL_div": 2.047173451946308, "train_lr": 0.001600676211311305, "train_wd": 0.12595322272886586, "epoch": 32}
+ {"train_loss": 4.934548416848752, "train_entropy": 2.9070626531771926, "train_KL_div": 2.0274857928541636, "train_lr": 0.0015771635171893791, "train_wd": 0.13082018696759773, "epoch": 33}
+ {"train_loss": 4.958427045532003, "train_entropy": 2.9459073564652516, "train_KL_div": 2.012519710103686, "train_lr": 0.0015528727099503327, "train_wd": 0.13577516091498287, "epoch": 34}
+ {"train_loss": 4.9835112736259966, "train_entropy": 2.9869284096305493, "train_KL_div": 1.996582885490191, "train_lr": 0.0015278333842010708, "train_wd": 0.1408132546099574, "epoch": 35}
+ {"train_loss": 5.0082967611188165, "train_entropy": 3.0312402377909335, "train_KL_div": 1.9770565351527833, "train_lr": 0.0015020760465030082, "train_wd": 0.1459294960623004, "epoch": 36}
+ {"train_loss": 5.035227231494314, "train_entropy": 3.073813812714384, "train_KL_div": 1.9614134335024342, "train_lr": 0.001475632078204518, "train_wd": 0.15111883615938876, "epoch": 37}
+ {"train_loss": 5.059636245353609, "train_entropy": 3.11374337152177, "train_KL_div": 1.9458928860475526, "train_lr": 0.0014485336972075889, "train_wd": 0.1563761536490648, "epoch": 38}
+ {"train_loss": 5.0876227424090255, "train_entropy": 3.1588326332362775, "train_KL_div": 1.9287901154201814, "train_lr": 0.0014208139187152813, "train_wd": 0.16169626019368868, "epoch": 39}
+ {"train_loss": 5.117905145662472, "train_entropy": 3.2064868840062695, "train_KL_div": 1.9114182702987639, "train_lr": 0.0013925065150077846, "train_wd": 0.16707390549040632, "epoch": 40, "k-NN": {"10": {"top1": 59.322, "top5": 79.404}, "20": {"top1": 59.328, "top5": 81.506}, "100": {"top1": 56.698, "top5": 82.704}, "200": {"top1": 54.892, "top5": 82.066}}}
+ {"train_loss": 5.148122236629513, "train_entropy": 3.2544393307843786, "train_KL_div": 1.893682910964792, "train_lr": 0.0013636459742960914, "train_wd": 0.17250378245255954, "epoch": 41}
+ {"train_loss": 5.18105975223329, "train_entropy": 3.308339905307819, "train_KL_div": 1.8727198579687445, "train_lr": 0.0013342674587034358, "train_wd": 0.17798053244714446, "epoch": 42}
+ {"train_loss": 5.217292945587755, "train_entropy": 3.363233090876282, "train_KL_div": 1.8540598520595244, "train_lr": 0.001304406761425651, "train_wd": 0.1834987505831314, "epoch": 43}
+ {"train_loss": 5.2538503690711025, "train_entropy": 3.4201239676187885, "train_KL_div": 1.8337264075099238, "train_lr": 0.001274100263122686, "train_wd": 0.1890529910454445, "epoch": 44}
+ {"train_loss": 5.2917530581762655, "train_entropy": 3.4798126008658876, "train_KL_div": 1.8119404586726424, "train_lr": 0.0012433848875943684, "train_wd": 0.1946377724693181, "epoch": 45}
+ {"train_loss": 5.329724394949066, "train_entropy": 3.5373386094837254, "train_KL_div": 1.7923857784474464, "train_lr": 0.00121229805679443, "train_wd": 0.20024758334974535, "epoch": 46}
+ {"train_loss": 5.376256227716608, "train_entropy": 3.6046666805665564, "train_KL_div": 1.7715895426668626, "train_lr": 0.001180877645237644, "train_wd": 0.20587688748066807, "epoch": 47}
+ {"train_loss": 5.427686794548277, "train_entropy": 3.6800479456117956, "train_KL_div": 1.7476388429012044, "train_lr": 0.0011491619338555533, "train_wd": 0.21152012941854353, "epoch": 48}
+ {"train_loss": 5.495041847251336, "train_entropy": 3.7711307713828224, "train_KL_div": 1.723911064908986, "train_lr": 0.0011171895633570396, "train_wd": 0.21717173996489547, "epoch": 49}
+ {"train_loss": 5.546321675767609, "train_entropy": 3.8298862473533233, "train_KL_div": 1.7164354330593836, "train_lr": 0.0010849994871505954, "train_wd": 0.22282614166244302, "epoch": 50, "k-NN": {"10": {"top1": 61.08, "top5": 80.518}, "20": {"top1": 60.874, "top5": 82.544}, "100": {"top1": 58.196, "top5": 83.776}, "200": {"top1": 56.68, "top5": 83.244}}}
+ {"train_loss": 5.596367934528207, "train_entropy": 3.8900171701114874, "train_KL_div": 1.70635076815178, "train_lr": 0.0010526309238855724, "train_wd": 0.22847775429937295, "epoch": 51}
+ {"train_loss": 5.634750574213059, "train_entropy": 3.9377086797359957, "train_KL_div": 1.6970419023212489, "train_lr": 0.0010201233096703167, "train_wd": 0.23412100041634132, "epoch": 52}
+ {"train_loss": 5.659682101035842, "train_entropy": 3.9675156920980807, "train_KL_div": 1.6921664106461742, "train_lr": 0.0009875162500253326, "train_wd": 0.23975031081074732, "epoch": 53}
+ {"train_loss": 5.6725967760254, "train_entropy": 3.9815986815151896, "train_KL_div": 1.690998106893416, "train_lr": 0.0009548494716300705, "train_wd": 0.24536013003286886, "epoch": 54}
+ {"train_loss": 5.682345183010002, "train_entropy": 3.997017678762474, "train_KL_div": 1.6853275152349705, "train_lr": 0.000922162773922069, "train_wd": 0.25094492186841155, "epoch": 55}
+ {"train_loss": 5.688484938236693, "train_entropy": 4.005564073769954, "train_KL_div": 1.6829208770676827, "train_lr": 0.0008894959806074867, "train_wd": 0.25649917480209633, "epoch": 56}
+ {"train_loss": 5.684495358419758, "train_entropy": 4.004727357914047, "train_KL_div": 1.679768000287972, "train_lr": 0.0008568888911420439, "train_wd": 0.2620174074568511, "epoch": 57}
+ {"train_loss": 5.678799920601986, "train_entropy": 4.000419752825151, "train_KL_div": 1.6783801814999728, "train_lr": 0.0008243812322415139, "train_wd": 0.26749417400326775, "epoch": 58}
+ {"train_loss": 5.674387078577323, "train_entropy": 3.9999229155840967, "train_KL_div": 1.6744641735954382, "train_lr": 0.0007920126094808529, "train_wd": 0.27292406953398446, "epoch": 59}
+ {"train_loss": 5.671711833144717, "train_entropy": 4.002662654023976, "train_KL_div": 1.669049198566505, "train_lr": 0.000759822459040886, "train_wd": 0.2783017353976724, "epoch": 60, "k-NN": {"10": {"top1": 62.806, "top5": 81.72}, "20": {"top1": 62.618, "top5": 83.628}, "100": {"top1": 60.056, "top5": 84.974}, "200": {"top1": 58.572, "top5": 84.432}}}
+ {"train_loss": 5.660590121968492, "train_entropy": 3.994437852683877, "train_KL_div": 1.6661522732653302, "train_lr": 0.0007278499996614098, "train_wd": 0.2836218644873766, "epoch": 61}
+ {"train_loss": 5.645887769018947, "train_entropy": 3.985993484536911, "train_KL_div": 1.6598942937275674, "train_lr": 0.0006961341848592022, "train_wd": 0.2888792064779921, "epoch": 62}
+ {"train_loss": 5.639259246146915, "train_entropy": 3.987984158453221, "train_KL_div": 1.6512751020309675, "train_lr": 0.0006647136554691547, "train_wd": 0.2940685730077003, "epoch": 63}
+ {"train_loss": 5.633681323788236, "train_entropy": 3.989522271363554, "train_KL_div": 1.6441590644059088, "train_lr": 0.0006336266925663635, "train_wd": 0.2991848427982596, "epoch": 64}
+ {"train_loss": 5.624753459467339, "train_entropy": 3.987066569344834, "train_KL_div": 1.637686892740956, "train_lr": 0.0006029111708265462, "train_wd": 0.3042229667090761, "epoch": 65}
+ {"train_loss": 5.616673219601462, "train_entropy": 3.985377643473723, "train_KL_div": 1.631295587941474, "train_lr": 0.00057260451238158, "train_wd": 0.3091779727201141, "epoch": 66}
+ {"train_loss": 5.607849497011691, "train_entropy": 3.9831280899923276, "train_KL_div": 1.624721418442285, "train_lr": 0.0005427436412263984, "train_wd": 0.31404497083866506, "epoch": 67}
+ {"train_loss": 5.594812982713922, "train_entropy": 3.976891397927396, "train_KL_div": 1.617921590509152, "train_lr": 0.0005133649382327895, "train_wd": 0.31881915792518467, "epoch": 68}
+ {"train_loss": 5.580072529600014, "train_entropy": 3.9698098581709993, "train_KL_div": 1.6102626789884649, "train_lr": 0.0004845041968249057, "train_wd": 0.32349582243341346, "epoch": 69}
+ {"train_loss": 5.564518190649575, "train_entropy": 3.962325740045892, "train_KL_div": 1.602192456918113, "train_lr": 0.0004561965793705001, "train_wd": 0.32807034906010163, "epoch": 70, "k-NN": {"10": {"top1": 64.826, "top5": 83.118}, "20": {"top1": 64.742, "top5": 85.09}, "100": {"top1": 62.174, "top5": 86.282}, "200": {"top1": 60.642, "top5": 85.85}}}
+ {"train_loss": 5.543964401819859, "train_entropy": 3.947192617997376, "train_KL_div": 1.596771788897475, "train_lr": 0.0004284765743409843, "train_wd": 0.3325382232997603, "epoch": 71}
+ {"train_loss": 5.522923958886648, "train_entropy": 3.93417712392857, "train_KL_div": 1.588746837001474, "train_lr": 0.00040137795429254924, "train_wd": 0.3368950358999267, "epoch": 72}
+ {"train_loss": 5.502818932936137, "train_entropy": 3.9222174397610066, "train_KL_div": 1.580601500555923, "train_lr": 0.00037493373471949746, "train_wd": 0.3411364872125675, "epoch": 73}
+ {"train_loss": 5.480490337757191, "train_entropy": 3.9090208735145313, "train_KL_div": 1.5714694644659812, "train_lr": 0.0003491761338299639, "train_wd": 0.3452583914373148, "epoch": 74}
+ {"train_loss": 5.458537823767999, "train_entropy": 3.898369680499709, "train_KL_div": 1.5601681505485883, "train_lr": 0.00032413653329297525, "train_wd": 0.3492566807523421, "epoch": 75}
+ {"train_loss": 5.4352101052955275, "train_entropy": 3.8860281843869324, "train_KL_div": 1.5491819323315161, "train_lr": 0.0002998454400047307, "train_wd": 0.3531274093288158, "epoch": 76}
+ {"train_loss": 5.409237869315578, "train_entropy": 3.8717542304645187, "train_KL_div": 1.537483637924273, "train_lr": 0.0002763324489206497, "train_wd": 0.3568667572249489, "epoch": 77}
+ {"train_loss": 5.3840945629914545, "train_entropy": 3.85965199550246, "train_KL_div": 1.5244425778567188, "train_lr": 0.00025362620699846477, "train_wd": 0.36047103415582465, "epoch": 78}
+ {"train_loss": 5.359117448151268, "train_entropy": 3.8476318763230526, "train_KL_div": 1.5114855720347884, "train_lr": 0.00023175437829633274, "train_wd": 0.3639366831352615, "epoch": 79}
+ {"train_loss": 5.332594535376705, "train_entropy": 3.834966447602144, "train_KL_div": 1.497628095976062, "train_lr": 0.0002107436102684294, "train_wd": 0.36726028398613086, "epoch": 80, "k-NN": {"10": {"top1": 66.988, "top5": 84.702}, "20": {"top1": 66.904, "top5": 86.692}, "100": {"top1": 64.662, "top5": 87.82}, "200": {"top1": 63.15, "top5": 87.448}}}
+ {"train_loss": 5.3052014506635885, "train_entropy": 3.8224457505044978, "train_KL_div": 1.482755704737192, "train_lr": 0.0001906195012991345, "train_wd": 0.37043855671565534, "epoch": 81}
+ {"train_loss": 5.27737653492184, "train_entropy": 3.8099125579931887, "train_KL_div": 1.4674639959417286, "train_lr": 0.00017140656951534778, "train_wd": 0.37346836475236883, "epoch": 82}
+ {"train_loss": 5.24841359698759, "train_entropy": 3.799110016703561, "train_KL_div": 1.4493035879439744, "train_lr": 0.00015312822291492297, "train_wd": 0.37634671804153147, "epoch": 83}
+ {"train_loss": 5.218285476259187, "train_entropy": 3.786083893936285, "train_KL_div": 1.4322015955212322, "train_lr": 0.00013580673084762667, "train_wd": 0.37907077599595584, "epoch": 84}
+ {"train_loss": 5.188462105414625, "train_entropy": 3.7755947127329716, "train_KL_div": 1.4128674038784608, "train_lr": 0.00011946319688337392, "train_wd": 0.3816378502993196, "epoch": 85}
+ {"train_loss": 5.158432929155843, "train_entropy": 3.7648213382471183, "train_KL_div": 1.3936115991158091, "train_lr": 0.00010411753310077069, "train_wd": 0.3840454075592109, "epoch": 86}
+ {"train_loss": 5.128862117461472, "train_entropy": 3.7548088992901536, "train_KL_div": 1.374053223514298, "train_lr": 8.978843582731787e-05, "train_wd": 0.3862910718072828, "epoch": 87}
+ {"train_loss": 5.101114791156022, "train_entropy": 3.7469708184014103, "train_KL_div": 1.354143985308101, "train_lr": 7.64933628608153e-05, "train_wd": 0.3883726268440472, "epoch": 88}
+ {"train_loss": 5.071964387368423, "train_entropy": 3.7389791289287904, "train_KL_div": 1.332985266677424, "train_lr": 6.42485121997233e-05, "train_wd": 0.3902880184259902, "epoch": 89}
+ {"train_loss": 5.044674053871931, "train_entropy": 3.7329133033707778, "train_KL_div": 1.3117607617705402, "train_lr": 5.3068802308397555e-05, "train_wd": 0.39203535629286773, "epoch": 90, "k-NN": {"10": {"top1": 68.408, "top5": 85.728}, "20": {"top1": 68.278, "top5": 87.49}, "100": {"top1": 66.088, "top5": 88.702}, "200": {"top1": 64.706, "top5": 88.426}}}
+ {"train_loss": 5.0189986519261485, "train_entropy": 3.728963007994831, "train_KL_div": 1.2900356560186217, "train_lr": 4.2967853941238885e-05, "train_wd": 0.39361291603316473, "epoch": 91}
+ {"train_loss": 4.995142575237535, "train_entropy": 3.7255157703940633, "train_KL_div": 1.2696268114900937, "train_lr": 3.395797354790087e-05, "train_wd": 0.39501914078587735, "epoch": 92}
+ {"train_loss": 4.972250503216758, "train_entropy": 3.7224846552176705, "train_KL_div": 1.2497658639638198, "train_lr": 2.6050138279776657e-05, "train_wd": 0.3962526427769516, "epoch": 93}
+ {"train_loss": 4.952739845437922, "train_entropy": 3.721689432892919, "train_KL_div": 1.2310504175753305, "train_lr": 1.9253982616032022e-05, "train_wd": 0.3973122046888454, "epoch": 94}
+ {"train_loss": 4.9359201938528035, "train_entropy": 3.7209787654514925, "train_KL_div": 1.2149414385233805, "train_lr": 1.3577786625475805e-05, "train_wd": 0.39819678086187243, "epoch": 95}
+ {"train_loss": 4.921588861538078, "train_entropy": 3.7198439879714886, "train_KL_div": 1.2017448748618575, "train_lr": 9.028465878571444e-06, "train_wd": 0.39890549832614663, "epoch": 96}
+ {"train_loss": 4.910402931672172, "train_entropy": 3.718958834274827, "train_KL_div": 1.1914441032400065, "train_lr": 5.611563021879905e-06, "train_wd": 0.399437657663097, "epoch": 97}
+ {"train_loss": 4.903731272322446, "train_entropy": 3.719735366936791, "train_KL_div": 1.1839959183187911, "train_lr": 3.3312410251985577e-06, "train_wd": 0.3997927336957028, "epoch": 98}
+ {"train_loss": 4.899616672433061, "train_entropy": 3.719795980438752, "train_KL_div": 1.1798206954502175, "train_lr": 2.1902781096236463e-06, "train_wd": 0.3999703760067923, "epoch": 99, "k-NN": {"10": {"top1": 68.66, "top5": 85.832}, "20": {"top1": 68.508, "top5": 87.636}, "100": {"top1": 66.33, "top5": 88.876}, "200": {"top1": 64.93, "top5": 88.53}}}