Bill Psomas committed
Commit 3f47daa
1 parent: 848db31

update readme

Files changed (4):
  1. README.md +54 -0
  2. checkpoint.pth +3 -0
  3. configs.yaml +45 -0
  4. log.txt +100 -0
README.md CHANGED
@@ -1,3 +1,57 @@
  ---
  license: cc-by-4.0
+ datasets:
+ - imagenet-1k
+ metrics:
+ - accuracy
+ pipeline_tag: image-classification
+ language:
+ - en
+ tags:
+ - resnet
+ - convolutional neural network
+ - simpool
+ - dino
+ - computer vision
+ - deep learning
  ---
+
+ # Self-supervised ResNet-50 model with SimPool
+
+ ResNet-50 model with SimPool (gamma=2.0) trained on ImageNet-1k for 100 epochs. Self-supervised with [DINO](https://arxiv.org/abs/2104.14294).
+
+ SimPool is a simple attention-based pooling method applied at the end of the network, introduced in this ICCV 2023 [paper](https://arxiv.org/pdf/2309.06891.pdf) and released in this [repository](https://github.com/billpsomas/simpool/).
+ Disclaimer: this model card is written by the author of SimPool, i.e. [Bill Psomas](http://users.ntua.gr/psomasbill/).
+
+ ## Motivation
+
+ Convolutional networks and vision transformers have different forms of pairwise interaction: pooling across layers and pooling at the end of the network. Does the latter really need to be different?
+ As a by-product of pooling, vision transformers provide spatial attention for free, but this attention is most often of low quality unless the model is self-supervised, which is not well studied. Is supervision really the problem?
+
+ ## Method
+
+ SimPool is a simple attention-based pooling mechanism that replaces the default pooling in both convolutional and transformer encoders. For transformers, we completely discard the [CLS] token.
+ Interestingly, we find that, whether supervised or self-supervised, SimPool improves performance on pre-training and downstream tasks and provides attention maps delineating object boundaries in all cases.
+ One could thus call SimPool universal.
+
+ ## Evaluation with k-NN
+
+ | k | top1 | top5 |
+ | ------- | ------- | ------- |
+ | 10 | 63.828 | 81.82 |
+ | 20 | 63.502 | 83.824 |
+ | 100 | 60.658 | 84.716 |
+ | 200 | 58.66 | 83.846 |
+
+ ## BibTeX entry and citation info
+
+ ```
+ @misc{psomas2023simpool,
+   title={Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit?},
+   author={Bill Psomas and Ioannis Kakogeorgiou and Konstantinos Karantzalos and Yannis Avrithis},
+   year={2023},
+   eprint={2309.06891},
+   archivePrefix={arXiv},
+   primaryClass={cs.CV}
+ }
+ ```
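The Method section above describes SimPool as attention-based pooling: a query summarizes the feature map, attention weights are computed over spatial locations, and the weighted features are pooled (with the gamma exponent controlling the pooling power). A minimal NumPy sketch of this idea follows; it is an illustrative simplification by the editor, not the authors' exact formulation — see the linked repository for the real implementation. The function name `attention_pool` is hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(features, gamma=2.0):
    """Generic attention-based pooling over a map of d-dim features.

    features: (n, d) array, e.g. a flattened 7x7 ResNet-50 feature map.
    Uses global average pooling as the query, scores every location by
    dot-product attention, and returns the attention-weighted pooled
    vector. The gamma power on the (rectified) features mirrors the
    gamma=2.0 setting in this model card, but the exact SimPool
    formulation differs; this is only a sketch.
    """
    q = features.mean(axis=0)                            # (d,) GAP query
    scores = features @ q / np.sqrt(features.shape[1])   # (n,) attention logits
    attn = softmax(scores)                               # (n,) weights, sum to 1
    pooled = (attn[:, None] * np.maximum(features, 0) ** gamma).sum(axis=0)
    return attn, pooled ** (1.0 / gamma)                 # undo the gamma power
```

The attention weights `attn` are what, in the real model, produce the object-boundary attention maps mentioned above.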
checkpoint.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1b3d901d0f9cd5582abed27c89f41a2968cb5f9a7dc0f58f96c7a32351367430
+ size 675750049
configs.yaml ADDED
@@ -0,0 +1,45 @@
+ arch: resnet50
+ backend: nccl
+ batch_size_per_gpu: 80
+ clip_grad: 0.0
+ data_path: /path/to/imagenet/
+ dist_url: env://
+ drop_path_rate: 0.1
+ epochs: 100
+ eval_every: 10
+ freeze_last_layer: 1
+ global_crops_scale:
+ - 0.14
+ - 1.0
+ local_crops_number: 6
+ local_crops_scale:
+ - 0.05
+ - 0.14
+ local_rank: 0
+ lr: 0.3
+ min_lr: 0.0048
+ mode: simpool
+ momentum_teacher: 0.996
+ nb_knn:
+ - 10
+ - 20
+ - 100
+ - 200
+ norm_last_layer: true
+ num_workers: 10
+ optimizer: lars
+ out_dim: 60000
+ output_dir: /path/to/output/
+ patch_size: 16
+ saveckp_freq: 20
+ seed: 0
+ subset: -1
+ teacher_temp: 0.07
+ temperature: 0.07
+ use_bn_in_head: true
+ use_fp16: false
+ warmup_epochs: 10
+ warmup_teacher_temp: 0.04
+ warmup_teacher_temp_epochs: 50
+ weight_decay: 1.0e-06
+ weight_decay_end: 1.0e-06
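The config lists `lr: 0.3`, while the `train_lr` values in log.txt peak near 1.5 after the 10 warmup epochs. Assuming DINO's linear learning-rate scaling rule (scaled lr = base lr × total batch size / 256), this is consistent with a total batch of 1280 images, e.g. 16 GPUs at the configured 80 images each. The GPU count is an inference by the editor, not recorded in these files; a quick arithmetic check:

```python
# Hedged check of DINO's linear lr scaling rule against this run's logs.
base_lr = 0.3        # configs.yaml: lr
batch_per_gpu = 80   # configs.yaml: batch_size_per_gpu
num_gpus = 16        # assumption, not stated in the files

total_batch = batch_per_gpu * num_gpus   # 1280
scaled_lr = base_lr * total_batch / 256  # 1.5
print(scaled_lr)  # 1.5, matching the ~1.4998 peak train_lr in log.txt
```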
log.txt ADDED
@@ -0,0 +1,100 @@
+ {"train_loss": 10.449837056398392, "train_entropy": 9.389393039047718, "train_KL_div": 1.0604439457431436, "train_lr": 0.07493249324932494, "train_wd": 1.000000000000015e-06, "epoch": 0, "k-NN": {"10": {"top1": 2.36, "top5": 5.818}, "20": {"top1": 2.73, "top5": 6.41}, "100": {"top1": 3.178, "top5": 8.764}, "200": {"top1": 3.292, "top5": 9.318}}}
+ {"train_loss": 10.167079906046391, "train_entropy": 9.65371017241478, "train_KL_div": 0.5133696808293462, "train_lr": 0.22494749474947498, "train_wd": 1.000000000000015e-06, "epoch": 1}
+ {"train_loss": 9.468380255460739, "train_entropy": 8.58747710877657, "train_KL_div": 0.880903076402843, "train_lr": 0.37496249624962497, "train_wd": 1.000000000000015e-06, "epoch": 2}
+ {"train_loss": 8.479013512432575, "train_entropy": 7.222656021177769, "train_KL_div": 1.256357355952263, "train_lr": 0.524977497749775, "train_wd": 1.000000000000015e-06, "epoch": 3}
+ {"train_loss": 7.515101688563823, "train_entropy": 5.9015126974582675, "train_KL_div": 1.6135889759212731, "train_lr": 0.674992499249925, "train_wd": 1.000000000000015e-06, "epoch": 4}
+ {"train_loss": 6.680009725153446, "train_entropy": 4.778078202843666, "train_KL_div": 1.9019315302670001, "train_lr": 0.8250075007500748, "train_wd": 1.000000000000015e-06, "epoch": 5}
+ {"train_loss": 6.12774421530962, "train_entropy": 4.047231483012438, "train_KL_div": 2.0805127462893727, "train_lr": 0.9750225022502252, "train_wd": 1.000000000000015e-06, "epoch": 6}
+ {"train_loss": 5.788710211277008, "train_entropy": 3.603530557900667, "train_KL_div": 2.1851796337515115, "train_lr": 1.1250375037503753, "train_wd": 1.000000000000015e-06, "epoch": 7}
+ {"train_loss": 5.561759056508541, "train_entropy": 3.325290210336447, "train_KL_div": 2.2364688062518834, "train_lr": 1.2750525052505253, "train_wd": 1.000000000000015e-06, "epoch": 8}
+ {"train_loss": 5.391559284448624, "train_entropy": 3.138916272819042, "train_KL_div": 2.2526429802775385, "train_lr": 1.4250675067506746, "train_wd": 1.000000000000015e-06, "epoch": 9}
+ {"train_loss": 5.2455336748361585, "train_entropy": 3.003100367337465, "train_KL_div": 2.2424333059489725, "train_lr": 1.4998484155601604, "train_wd": 1.000000000000015e-06, "epoch": 10, "k-NN": {"10": {"top1": 40.368, "top5": 60.464}, "20": {"top1": 40.57, "top5": 63.354}, "100": {"top1": 37.946, "top5": 64.804}, "200": {"top1": 36.05, "top5": 63.766}}}
+ {"train_loss": 5.124244156062603, "train_entropy": 2.903090033352375, "train_KL_div": 2.2211541321873667, "train_lr": 1.4989382202191224, "train_wd": 1.000000000000015e-06, "epoch": 11}
+ {"train_loss": 5.0256589850187305, "train_entropy": 2.8311997632086277, "train_KL_div": 2.1944592456817626, "train_lr": 1.4971184830521438, "train_wd": 1.000000000000015e-06, "epoch": 12}
+ {"train_loss": 4.943505420267582, "train_entropy": 2.7756613405048847, "train_KL_div": 2.1678441066741945, "train_lr": 1.494391421128656, "train_wd": 1.000000000000015e-06, "epoch": 13}
+ {"train_loss": 4.879734697937965, "train_entropy": 2.7337953796088694, "train_KL_div": 2.1459393372386693, "train_lr": 1.4907603569535401, "train_wd": 1.000000000000015e-06, "epoch": 14}
+ {"train_loss": 4.82360738325119, "train_entropy": 2.7040232205688954, "train_KL_div": 2.119584177866578, "train_lr": 1.4862297144191705, "train_wd": 1.000000000000015e-06, "epoch": 15}
+ {"train_loss": 4.777354749560356, "train_entropy": 2.681550843566656, "train_KL_div": 2.095803919225931, "train_lr": 1.4808050134155832, "train_wd": 1.000000000000015e-06, "epoch": 16}
+ {"train_loss": 4.740407183349133, "train_entropy": 2.666192293405533, "train_KL_div": 2.074214899018407, "train_lr": 1.4744928631053371, "train_wd": 1.000000000000015e-06, "epoch": 17}
+ {"train_loss": 4.711565686523914, "train_entropy": 2.658193520128727, "train_KL_div": 2.0533721644580365, "train_lr": 1.4673009538712782, "train_wd": 1.000000000000015e-06, "epoch": 18}
+ {"train_loss": 4.689444116234779, "train_entropy": 2.657263245970011, "train_KL_div": 2.0321808568388224, "train_lr": 1.4592380479469773, "train_wd": 1.000000000000015e-06, "epoch": 19}
+ {"train_loss": 4.6621909416913985, "train_entropy": 2.6582521418631075, "train_KL_div": 2.0039387890696525, "train_lr": 1.450313968741308, "train_wd": 1.000000000000015e-06, "epoch": 20, "k-NN": {"10": {"top1": 53.63, "top5": 73.67}, "20": {"top1": 53.372, "top5": 76.126}, "100": {"top1": 50.418, "top5": 77.084}, "200": {"top1": 48.432, "top5": 76.132}}}
+ {"train_loss": 4.647263663291931, "train_entropy": 2.6619999133348466, "train_KL_div": 1.985263742864132, "train_lr": 1.4405395888701316, "train_wd": 1.000000000000015e-06, "epoch": 21}
+ {"train_loss": 4.631990493118763, "train_entropy": 2.669117329120636, "train_KL_div": 1.9628731496036054, "train_lr": 1.4299268169096957, "train_wd": 1.000000000000015e-06, "epoch": 22}
+ {"train_loss": 4.619528388857842, "train_entropy": 2.6794597724974154, "train_KL_div": 1.9400685989707709, "train_lr": 1.4184885828878597, "train_wd": 1.000000000000015e-06, "epoch": 23}
+ {"train_loss": 4.611837514638901, "train_entropy": 2.6918885149657727, "train_KL_div": 1.919948979064822, "train_lr": 1.4062388225308553, "train_wd": 1.000000000000015e-06, "epoch": 24}
+ {"train_loss": 4.603427241146565, "train_entropy": 2.7055979170501234, "train_KL_div": 1.8978293050229549, "train_lr": 1.39319246028475, "train_wd": 1.000000000000015e-06, "epoch": 25}
+ {"train_loss": 4.5979758430719375, "train_entropy": 2.72082040977478, "train_KL_div": 1.87715540663898, "train_lr": 1.3793653911322932, "train_wd": 1.000000000000015e-06, "epoch": 26}
+ {"train_loss": 4.593641734242439, "train_entropy": 2.7375692039728166, "train_KL_div": 1.8560725199133157, "train_lr": 1.3647744612273618, "train_wd": 1.000000000000015e-06, "epoch": 27}
+ {"train_loss": 4.595699313998223, "train_entropy": 2.757804524928331, "train_KL_div": 1.8378947650045157, "train_lr": 1.3494374473704784, "train_wd": 1.000000000000015e-06, "epoch": 28}
+ {"train_loss": 4.596212774693966, "train_entropy": 2.7804619541168214, "train_KL_div": 1.815750805452466, "train_lr": 1.3333730353505442, "train_wd": 1.000000000000015e-06, "epoch": 29}
+ {"train_loss": 4.598976962089538, "train_entropy": 2.8030542490184307, "train_KL_div": 1.795922696441412, "train_lr": 1.3166007971790659, "train_wd": 1.000000000000015e-06, "epoch": 30, "k-NN": {"10": {"top1": 58.68, "top5": 78.158}, "20": {"top1": 58.574, "top5": 80.422}, "100": {"top1": 55.572, "top5": 81.364}, "200": {"top1": 53.522, "top5": 80.32}}}
+ {"train_loss": 4.601598667144775, "train_entropy": 2.828574624478817, "train_KL_div": 1.7730240271687507, "train_lr": 1.299141167244701, "train_wd": 1.000000000000015e-06, "epoch": 31}
+ {"train_loss": 4.606501205146313, "train_entropy": 2.8556448546946047, "train_KL_div": 1.750856324851513, "train_lr": 1.2810154174170678, "train_wd": 1.000000000000015e-06, "epoch": 32}
+ {"train_loss": 4.618217970252037, "train_entropy": 2.884929783701897, "train_KL_div": 1.7332881642729043, "train_lr": 1.2622456311302719, "train_wd": 1.000000000000015e-06, "epoch": 33}
+ {"train_loss": 4.624960979044437, "train_entropy": 2.9153065445423127, "train_KL_div": 1.7096543975025416, "train_lr": 1.242854676477644, "train_wd": 1.000000000000015e-06, "epoch": 34}
+ {"train_loss": 4.634613274753094, "train_entropy": 2.946100403189659, "train_KL_div": 1.6885128435194492, "train_lr": 1.222866178350482, "train_wd": 1.000000000000015e-06, "epoch": 35}
+ {"train_loss": 4.647865023553371, "train_entropy": 2.9811553439497946, "train_KL_div": 1.6667096441090108, "train_lr": 1.2023044896547554, "train_wd": 1.000000000000015e-06, "epoch": 36}
+ {"train_loss": 4.661201903760433, "train_entropy": 3.016474710315466, "train_KL_div": 1.6447271532565355, "train_lr": 1.181194661640857, "train_wd": 1.000000000000015e-06, "epoch": 37}
+ {"train_loss": 4.6744207764863965, "train_entropy": 3.0524465629458426, "train_KL_div": 1.6219741736352444, "train_lr": 1.1595624133825075, "train_wd": 1.000000000000015e-06, "epoch": 38}
+ {"train_loss": 4.693183571636677, "train_entropy": 3.09200253623724, "train_KL_div": 1.6011809964329005, "train_lr": 1.1374341004420114, "train_wd": 1.000000000000015e-06, "epoch": 39}
+ {"train_loss": 4.711371740698814, "train_entropy": 3.1325233362019063, "train_KL_div": 1.578848370999098, "train_lr": 1.114836682760085, "train_wd": 1.000000000000015e-06, "epoch": 40, "k-NN": {"10": {"top1": 61.32, "top5": 80.242}, "20": {"top1": 61.088, "top5": 82.392}, "100": {"top1": 58.196, "top5": 83.298}, "200": {"top1": 56.278, "top5": 82.364}}}
+ {"train_loss": 4.730860756337643, "train_entropy": 3.175362041980028, "train_KL_div": 1.55549867272377, "train_lr": 1.0917976918093049, "train_wd": 1.000000000000015e-06, "epoch": 41}
+ {"train_loss": 4.754233127295971, "train_entropy": 3.220520591288805, "train_KL_div": 1.533712494507432, "train_lr": 1.0683451970512654, "train_wd": 1.000000000000015e-06, "epoch": 42}
+ {"train_loss": 4.78042931753397, "train_entropy": 3.269156540900469, "train_KL_div": 1.5112727368921042, "train_lr": 1.0445077717382412, "train_wd": 1.000000000000015e-06, "epoch": 43}
+ {"train_loss": 4.8067700149416925, "train_entropy": 3.3187265184819696, "train_KL_div": 1.4880434669405223, "train_lr": 1.0203144581011085, "train_wd": 1.000000000000015e-06, "epoch": 44}
+ {"train_loss": 4.837271986544132, "train_entropy": 3.3728389540314674, "train_KL_div": 1.4644330016374587, "train_lr": 0.9957947319658386, "train_wd": 1.000000000000015e-06, "epoch": 45}
+ {"train_loss": 4.867743356406689, "train_entropy": 3.427095182299614, "train_KL_div": 1.440648139283061, "train_lr": 0.9709784668417526, "train_wd": 1.000000000000015e-06, "epoch": 46}
+ {"train_loss": 4.902647614359855, "train_entropy": 3.485941599428654, "train_KL_div": 1.4167059880495072, "train_lr": 0.9458958975252506, "train_wd": 1.000000000000015e-06, "epoch": 47}
+ {"train_loss": 4.936469689190388, "train_entropy": 3.544396564155817, "train_KL_div": 1.392073102414608, "train_lr": 0.9205775832633725, "train_wd": 1.000000000000015e-06, "epoch": 48}
+ {"train_loss": 4.9753448957204816, "train_entropy": 3.606009334564209, "train_KL_div": 1.3693355347663163, "train_lr": 0.8950543705220573, "train_wd": 1.000000000000015e-06, "epoch": 49}
+ {"train_loss": 4.968914308726788, "train_entropy": 3.60816898432374, "train_KL_div": 1.3607453027963639, "train_lr": 0.8693573554044859, "train_wd": 1.000000000000015e-06, "epoch": 50, "k-NN": {"10": {"top1": 62.606, "top5": 81.258}, "20": {"top1": 62.51, "top5": 83.434}, "100": {"top1": 59.706, "top5": 84.204}, "200": {"top1": 57.664, "top5": 83.348}}}
+ {"train_loss": 4.963112980544567, "train_entropy": 3.6079727408289908, "train_KL_div": 1.3551402306109668, "train_lr": 0.8435178457652498, "train_wd": 1.000000000000015e-06, "epoch": 51}
+ {"train_loss": 4.955107469856739, "train_entropy": 3.605429495513439, "train_KL_div": 1.3496779587417842, "train_lr": 0.81756732306658, "train_wd": 1.000000000000015e-06, "epoch": 52}
+ {"train_loss": 4.947526503682137, "train_entropy": 3.6032412466406822, "train_KL_div": 1.3442852396517992, "train_lr": 0.7915374040230085, "train_wd": 1.000000000000015e-06, "epoch": 53}
+ {"train_loss": 4.939839230179786, "train_entropy": 3.5994246728420256, "train_KL_div": 1.340414534404874, "train_lr": 0.7654598020812908, "train_wd": 1.000000000000015e-06, "epoch": 54}
+ {"train_loss": 4.929350660145283, "train_entropy": 3.5949827539026735, "train_KL_div": 1.334367891728878, "train_lr": 0.739366288782445, "train_wd": 1.000000000000015e-06, "epoch": 55}
+ {"train_loss": 4.918193859159946, "train_entropy": 3.589451118350029, "train_KL_div": 1.3287427161484957, "train_lr": 0.7132886550530276, "train_wd": 1.000000000000015e-06, "epoch": 56}
+ {"train_loss": 4.908644145309925, "train_entropy": 3.5840909039676188, "train_KL_div": 1.324553216382861, "train_lr": 0.6872586724727882, "train_wd": 1.000000000000015e-06, "epoch": 57}
+ {"train_loss": 4.898883959114552, "train_entropy": 3.5798143591284752, "train_KL_div": 1.3190695780962705, "train_lr": 0.661308054565888, "train_wd": 1.000000000000015e-06, "epoch": 58}
+ {"train_loss": 4.887877831161022, "train_entropy": 3.5738284061849117, "train_KL_div": 1.3140494076013565, "train_lr": 0.6354684181628609, "train_wd": 1.000000000000015e-06, "epoch": 59}
+ {"train_loss": 4.876774698376655, "train_entropy": 3.5685866465270517, "train_KL_div": 1.308188029706478, "train_lr": 0.6097712448803728, "train_wd": 1.000000000000015e-06, "epoch": 60, "k-NN": {"10": {"top1": 63.514, "top5": 81.712}, "20": {"top1": 63.278, "top5": 83.788}, "100": {"top1": 60.484, "top5": 84.738}, "200": {"top1": 58.544, "top5": 83.858}}}
+ {"train_loss": 4.867280040442943, "train_entropy": 3.5640356072187425, "train_KL_div": 1.3032444142103194, "train_lr": 0.5842478427657233, "train_wd": 1.000000000000015e-06, "epoch": 61}
+ {"train_loss": 4.855790178358554, "train_entropy": 3.557568785637617, "train_KL_div": 1.2982213801592588, "train_lr": 0.5589293081528082, "train_wd": 1.000000000000015e-06, "epoch": 62}
+ {"train_loss": 4.847070850729942, "train_entropy": 3.553624374985695, "train_KL_div": 1.293446442693472, "train_lr": 0.5338464877760347, "train_wd": 1.000000000000015e-06, "epoch": 63}
+ {"train_loss": 4.8365606622695925, "train_entropy": 3.548783615887165, "train_KL_div": 1.2877770300209521, "train_lr": 0.5090299411883176, "train_wd": 1.000000000000015e-06, "epoch": 64}
+ {"train_loss": 4.826163962960243, "train_entropy": 3.54379882606864, "train_KL_div": 1.2823651144653558, "train_lr": 0.48450990352897855, "train_wd": 1.000000000000015e-06, "epoch": 65}
+ {"train_loss": 4.8161005507111545, "train_entropy": 3.538640930235386, "train_KL_div": 1.2774595837146043, "train_lr": 0.4603162486868846, "train_wd": 1.000000000000015e-06, "epoch": 66}
+ {"train_loss": 4.804622949123383, "train_entropy": 3.533218850046396, "train_KL_div": 1.2714040691405535, "train_lr": 0.4364784529037116, "train_wd": 1.000000000000015e-06, "epoch": 67}
+ {"train_loss": 4.795523640930653, "train_entropy": 3.529409927397966, "train_KL_div": 1.2661136866286398, "train_lr": 0.41302555886169284, "train_wd": 1.000000000000015e-06, "epoch": 68}
+ {"train_loss": 4.785011376559734, "train_entropy": 3.525017201423645, "train_KL_div": 1.2599941370785237, "train_lr": 0.38998614029957523, "train_wd": 1.000000000000015e-06, "epoch": 69}
+ {"train_loss": 4.775844187557698, "train_entropy": 3.5197538111507893, "train_KL_div": 1.2560903428643941, "train_lr": 0.36738826719992773, "train_wd": 1.000000000000015e-06, "epoch": 70, "k-NN": {"10": {"top1": 63.702, "top5": 81.938}, "20": {"top1": 63.478, "top5": 83.982}, "100": {"top1": 60.828, "top5": 84.75}, "200": {"top1": 58.816, "top5": 83.878}}}
+ {"train_loss": 4.766067259669304, "train_entropy": 3.5165095616281032, "train_KL_div": 1.2495576706379652, "train_lr": 0.34525947159018555, "train_wd": 1.000000000000015e-06, "epoch": 71}
+ {"train_loss": 4.757967102885246, "train_entropy": 3.511660830795765, "train_KL_div": 1.24630623729527, "train_lr": 0.32362671399911985, "train_wd": 1.000000000000015e-06, "epoch": 72}
+ {"train_loss": 4.747447923123836, "train_entropy": 3.50767432564497, "train_KL_div": 1.2397735749036074, "train_lr": 0.30251635060958443, "train_wd": 1.000000000000015e-06, "epoch": 73}
+ {"train_loss": 4.7385219835639, "train_entropy": 3.503481776356697, "train_KL_div": 1.2350401680916547, "train_lr": 0.2819541011475686, "train_wd": 1.000000000000015e-06, "epoch": 74}
+ {"train_loss": 4.729638677358627, "train_entropy": 3.500288457751274, "train_KL_div": 1.2293501847088337, "train_lr": 0.26196501754666823, "train_wd": 1.000000000000015e-06, "epoch": 75}
+ {"train_loss": 4.721902140021324, "train_entropy": 3.497050025075674, "train_KL_div": 1.224852084979415, "train_lr": 0.24257345342617007, "train_wd": 1.000000000000015e-06, "epoch": 76}
+ {"train_loss": 4.715184380650521, "train_entropy": 3.49575875338912, "train_KL_div": 1.2194255896955728, "train_lr": 0.2238030344199124, "train_wd": 1.000000000000015e-06, "epoch": 77}
+ {"train_loss": 4.707324535787105, "train_entropy": 3.492444661796093, "train_KL_div": 1.2148798391669988, "train_lr": 0.20567662939209372, "train_wd": 1.000000000000015e-06, "epoch": 78}
+ {"train_loss": 4.701352820754051, "train_entropy": 3.491047471255064, "train_KL_div": 1.2103053124994039, "train_lr": 0.18821632257508172, "train_wd": 1.000000000000015e-06, "epoch": 79}
+ {"train_loss": 4.694893266558648, "train_entropy": 3.489258858293295, "train_KL_div": 1.2056343666762113, "train_lr": 0.1714433866631785, "train_wd": 1.000000000000015e-06, "epoch": 80, "k-NN": {"10": {"top1": 63.842, "top5": 82.0}, "20": {"top1": 63.654, "top5": 83.898}, "100": {"top1": 60.766, "top5": 84.776}, "200": {"top1": 58.784, "top5": 83.946}}}
+ {"train_loss": 4.689299988627433, "train_entropy": 3.48850383323431, "train_KL_div": 1.2007961137667298, "train_lr": 0.1553782568951199, "train_wd": 1.000000000000015e-06, "epoch": 81}
+ {"train_loss": 4.684984575450421, "train_entropy": 3.488232976526022, "train_KL_div": 1.196751569621265, "train_lr": 0.14004050615688546, "train_wd": 1.000000000000015e-06, "epoch": 82}
+ {"train_loss": 4.67952576893568, "train_entropy": 3.4863946334719658, "train_KL_div": 1.1931311101168394, "train_lr": 0.1254488211351497, "train_wd": 1.000000000000015e-06, "epoch": 83}
+ {"train_loss": 4.674294273018837, "train_entropy": 3.486509329199791, "train_KL_div": 1.1877849027067422, "train_lr": 0.11162097955043498, "train_wd": 1.000000000000015e-06, "epoch": 84}
+ {"train_loss": 4.67085353410244, "train_entropy": 3.4866335148513317, "train_KL_div": 1.1842199771180748, "train_lr": 0.09857382849769691, "train_wd": 1.000000000000015e-06, "epoch": 85}
+ {"train_loss": 4.667921773612499, "train_entropy": 3.486697638005018, "train_KL_div": 1.1812240997850896, "train_lr": 0.0863232639207329, "train_wd": 1.000000000000015e-06, "epoch": 86}
+ {"train_loss": 4.663681262671948, "train_entropy": 3.486406087011099, "train_KL_div": 1.177275151245296, "train_lr": 0.07488421124542628, "train_wd": 1.000000000000015e-06, "epoch": 87}
+ {"train_loss": 4.6620805332064625, "train_entropy": 3.4878364756703375, "train_KL_div": 1.1742440226003528, "train_lr": 0.06427060719540977, "train_wd": 1.000000000000015e-06, "epoch": 88}
+ {"train_loss": 4.658433960437774, "train_entropy": 3.486937234252691, "train_KL_div": 1.1714966863766312, "train_lr": 0.054495382812318714, "train_wd": 1.000000000000015e-06, "epoch": 89}
+ {"train_loss": 4.655751956164837, "train_entropy": 3.487239393979311, "train_KL_div": 1.1685125179886817, "train_lr": 0.04557044770130586, "train_wd": 1.000000000000015e-06, "epoch": 90, "k-NN": {"10": {"top1": 63.758, "top5": 81.838}, "20": {"top1": 63.522, "top5": 83.82}, "100": {"top1": 60.678, "top5": 84.75}, "200": {"top1": 58.682, "top5": 83.844}}}
+ {"train_loss": 4.654251498639583, "train_entropy": 3.488241710752249, "train_KL_div": 1.1660097463428973, "train_lr": 0.03750667552102294, "train_wd": 1.000000000000015e-06, "epoch": 91}
+ {"train_loss": 4.653276021361351, "train_entropy": 3.4884550351798533, "train_KL_div": 1.1648209442570805, "train_lr": 0.030313890735742997, "train_wd": 1.000000000000015e-06, "epoch": 92}
+ {"train_loss": 4.651876353561878, "train_entropy": 3.488577649265528, "train_KL_div": 1.1632986719980836, "train_lr": 0.024000856645763145, "train_wd": 1.000000000000015e-06, "epoch": 93}
+ {"train_loss": 4.649052367269993, "train_entropy": 3.488668581187725, "train_KL_div": 1.1603837516382336, "train_lr": 0.018575264710673736, "train_wd": 1.000000000000015e-06, "epoch": 94}
+ {"train_loss": 4.6480642236471175, "train_entropy": 3.489142097502947, "train_KL_div": 1.1589220889136196, "train_lr": 0.014043725178499274, "train_wd": 1.000000000000015e-06, "epoch": 95}
+ {"train_loss": 4.64655362045765, "train_entropy": 3.4882131513655183, "train_KL_div": 1.158340444765985, "train_lr": 0.01041175903212961, "train_wd": 1.000000000000015e-06, "epoch": 96}
+ {"train_loss": 4.646539884030819, "train_entropy": 3.489233550876379, "train_KL_div": 1.1573062996640802, "train_lr": 0.007683791262852558, "train_wd": 1.000000000000015e-06, "epoch": 97}
+ {"train_loss": 4.645518612086773, "train_entropy": 3.4883745178878307, "train_KL_div": 1.1571440563127398, "train_lr": 0.005863145479183758, "train_wd": 1.000000000000015e-06, "epoch": 98}
+ {"train_loss": 4.644806251823902, "train_entropy": 3.488529181241989, "train_KL_div": 1.1562770416736603, "train_lr": 0.004952039857561666, "train_wd": 1.000000000000015e-06, "epoch": 99, "k-NN": {"10": {"top1": 63.828, "top5": 81.82}, "20": {"top1": 63.502, "top5": 83.824}, "100": {"top1": 60.658, "top5": 84.716}, "200": {"top1": 58.66, "top5": 83.846}}}
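Each line of log.txt above is a self-contained JSON object; epochs where k-NN evaluation ran (every 10 epochs, per `eval_every`) additionally carry a "k-NN" field. A small helper to extract the top-1 curve for a given k, written by the editor as a sketch (not part of the repository):

```python
import json

def knn_top1_curve(lines, k="20"):
    """Map epoch -> k-NN top-1 accuracy from DINO-style JSON log lines.

    Lines without a "k-NN" field (non-evaluation epochs) are skipped.
    """
    curve = {}
    for line in lines:
        rec = json.loads(line)
        if "k-NN" in rec:
            curve[rec["epoch"]] = rec["k-NN"][k]["top1"]
    return curve

# Example with one line in the log's format (values abbreviated from epoch 0):
sample = ['{"train_loss": 10.45, "epoch": 0, '
          '"k-NN": {"10": {"top1": 2.36, "top5": 5.818}, '
          '"20": {"top1": 2.73, "top5": 6.41}}}']
print(knn_top1_curve(sample))  # {0: 2.73}
```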