yoshitomo-matsubara committed on
Commit
0cdf304
1 Parent(s): 14a227a

tuned hyperparameters

Files changed (3)
  1. pytorch_model.bin +1 -1
  2. tokenizer.json +0 -0
  3. training.log +57 -50
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:7b22983e839da805d45db0118bb6393a82724bc9cb3a089ab14888b70fa086f8
+ oid sha256:9e1294663f653252c3a4b39410e506695cdb706975378e0773313e2a2b950f16
  size 1340746825
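The two pointer versions above differ only in the oid; the size stays at 1340746825 bytes. The following is a minimal sketch of reading such a pointer, assuming the plain "key value" line layout shown in this diff. `parse_lfs_pointer` is a hypothetical helper written for illustration, not part of Git LFS or any library.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by the first space.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid has the form "<algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the diff above (old and new versions).
OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:7b22983e839da805d45db0118bb6393a82724bc9cb3a089ab14888b70fa086f8
size 1340746825
"""
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:9e1294663f653252c3a4b39410e506695cdb706975378e0773313e2a2b950f16
size 1340746825
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)
same_size = old["size"] == new["size"]
weights_changed = old["digest"] != new["digest"]
print(f"same size: {same_size}, weights changed: {weights_changed}")
```

An unchanged size with a different digest is what one would expect when the same architecture is retrained, though the pointer alone does not prove that.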
tokenizer.json CHANGED
The diff for this file is too large to render.
training.log CHANGED
@@ -1,56 +1,63 @@
- 2021-05-21 19:52:16,903 INFO __main__ Namespace(adjust_lr=False, config='torchdistill/configs/sample/glue/cola/ce/bert_large_uncased.yaml', log='log/glue/cola/ce/bert_large_uncased.txt', private_output='leaderboard/glue/standard/bert_large_uncased/', seed=None, student_only=False, task_name='cola', test_only=False, world_size=1)
- 2021-05-21 19:52:16,964 INFO __main__ Distributed environment: NO
  Num processes: 1
  Process index: 0
  Local process index: 0
  Device: cuda
  Use FP16 precision: True
 
- 2021-05-21 19:52:17,253 INFO filelock Lock 139707469705104 acquired on /root/.cache/huggingface/transformers/1cf090f220f9674b67b3434decfe4d40a6532d7849653eac435ff94d31a4904c.1d03e5e4fa2db2532c517b2cd98290d8444b237619bd3d2039850a6d5e86473d.lock
- 2021-05-21 19:52:17,525 INFO filelock Lock 139707469705104 released on /root/.cache/huggingface/transformers/1cf090f220f9674b67b3434decfe4d40a6532d7849653eac435ff94d31a4904c.1d03e5e4fa2db2532c517b2cd98290d8444b237619bd3d2039850a6d5e86473d.lock
- 2021-05-21 19:52:18,060 INFO filelock Lock 139707469840208 acquired on /root/.cache/huggingface/transformers/e12f02d630da91a0982ce6db1ad595231d155a2b725ab106971898276d842ecc.d789d64ebfe299b0e416afc4a169632f903f693095b4629a7ea271d5a0cf2c99.lock
- 2021-05-21 19:52:18,665 INFO filelock Lock 139707469840208 released on /root/.cache/huggingface/transformers/e12f02d630da91a0982ce6db1ad595231d155a2b725ab106971898276d842ecc.d789d64ebfe299b0e416afc4a169632f903f693095b4629a7ea271d5a0cf2c99.lock
- 2021-05-21 19:52:18,932 INFO filelock Lock 139707469840208 acquired on /root/.cache/huggingface/transformers/475d46024228961ca8770cead39e1079f135fd2441d14cf216727ffac8d41d78.7f2721073f19841be16f41b0a70b600ca6b880c8f3df6f3535cbc704371bdfa4.lock
- 2021-05-21 19:52:19,623 INFO filelock Lock 139707469840208 released on /root/.cache/huggingface/transformers/475d46024228961ca8770cead39e1079f135fd2441d14cf216727ffac8d41d78.7f2721073f19841be16f41b0a70b600ca6b880c8f3df6f3535cbc704371bdfa4.lock
- 2021-05-21 19:52:20,432 INFO filelock Lock 139707469872272 acquired on /root/.cache/huggingface/transformers/300ecd79785b4602752c0085f8a89c3f0232ef367eda291c79a5600f3778b677.20430bd8e10ef77a7d2977accefe796051e01bc2fc4aa146bc862997a1a15e79.lock
- 2021-05-21 19:52:20,702 INFO filelock Lock 139707469872272 released on /root/.cache/huggingface/transformers/300ecd79785b4602752c0085f8a89c3f0232ef367eda291c79a5600f3778b677.20430bd8e10ef77a7d2977accefe796051e01bc2fc4aa146bc862997a1a15e79.lock
- 2021-05-21 19:52:20,993 INFO filelock Lock 139707502187088 acquired on /root/.cache/huggingface/transformers/1d959166dd7e047e57ea1b2d9b7b9669938a7e90c5e37a03961ad9f15eaea17f.fea64cd906e3766b04c92397f9ad3ff45271749cbe49829a079dd84e34c1697d.lock
- 2021-05-21 19:52:43,690 INFO filelock Lock 139707502187088 released on /root/.cache/huggingface/transformers/1d959166dd7e047e57ea1b2d9b7b9669938a7e90c5e37a03961ad9f15eaea17f.fea64cd906e3766b04c92397f9ad3ff45271749cbe49829a079dd84e34c1697d.lock
- 2021-05-21 19:52:53,493 INFO __main__ Start training
- 2021-05-21 19:52:53,493 INFO torchdistill.models.util [student model]
- 2021-05-21 19:52:53,493 INFO torchdistill.models.util Using the original student model
- 2021-05-21 19:52:53,493 INFO torchdistill.core.training Loss = 1.0 * OrgLoss
- 2021-05-21 19:52:59,798 INFO torchdistill.misc.log Epoch: [0] [ 0/268] eta: 0:02:14 lr: 1.9975124378109453e-05 sample/s: 8.285208596713021 loss: 0.7285 (0.7285) time: 0.5037 data: 0.0209 max mem: 5355
- 2021-05-21 19:53:18,618 INFO torchdistill.misc.log Epoch: [0] [ 50/268] eta: 0:01:22 lr: 1.873134328358209e-05 sample/s: 11.48804134737782 loss: 0.5990 (0.6344) time: 0.3712 data: 0.0031 max mem: 7403
- 2021-05-21 19:53:37,548 INFO torchdistill.misc.log Epoch: [0] [100/268] eta: 0:01:03 lr: 1.7487562189054726e-05 sample/s: 9.439338529019249 loss: 0.4860 (0.5769) time: 0.3813 data: 0.0029 max mem: 7403
- 2021-05-21 19:53:56,330 INFO torchdistill.misc.log Epoch: [0] [150/268] eta: 0:00:44 lr: 1.6243781094527366e-05 sample/s: 11.48365361250954 loss: 0.4258 (0.5312) time: 0.3704 data: 0.0029 max mem: 7403
- 2021-05-21 19:54:14,604 INFO torchdistill.misc.log Epoch: [0] [200/268] eta: 0:00:25 lr: 1.5000000000000002e-05 sample/s: 9.44590729538729 loss: 0.3916 (0.4993) time: 0.3633 data: 0.0030 max mem: 7403
- 2021-05-21 19:54:33,394 INFO torchdistill.misc.log Epoch: [0] [250/268] eta: 0:00:06 lr: 1.3756218905472638e-05 sample/s: 8.100720542983192 loss: 0.4126 (0.4791) time: 0.3930 data: 0.0030 max mem: 7403
- 2021-05-21 19:54:39,779 INFO torchdistill.misc.log Epoch: [0] Total time: 0:01:40
- 2021-05-21 19:54:42,894 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/cola/default_experiment-1-0.arrow
- 2021-05-21 19:54:42,894 INFO __main__ Validation: matthews_correlation = 0.5755578350293411
- 2021-05-21 19:54:42,894 INFO __main__ Updating ckpt
- 2021-05-21 19:54:47,852 INFO torchdistill.misc.log Epoch: [1] [ 0/268] eta: 0:01:37 lr: 1.3308457711442788e-05 sample/s: 11.084181081356062 loss: 0.2307 (0.2307) time: 0.3650 data: 0.0042 max mem: 7826
- 2021-05-21 19:55:06,275 INFO torchdistill.misc.log Epoch: [1] [ 50/268] eta: 0:01:20 lr: 1.2064676616915423e-05 sample/s: 11.449492398950676 loss: 0.2716 (0.2880) time: 0.3855 data: 0.0030 max mem: 7826
- 2021-05-21 19:55:25,586 INFO torchdistill.misc.log Epoch: [1] [100/268] eta: 0:01:03 lr: 1.082089552238806e-05 sample/s: 11.479088581543106 loss: 0.2913 (0.2842) time: 0.3964 data: 0.0030 max mem: 7829
- 2021-05-21 19:55:44,446 INFO torchdistill.misc.log Epoch: [1] [150/268] eta: 0:00:44 lr: 9.577114427860697e-06 sample/s: 11.456959277340616 loss: 0.2614 (0.2788) time: 0.3965 data: 0.0031 max mem: 7829
- 2021-05-21 19:56:02,782 INFO torchdistill.misc.log Epoch: [1] [200/268] eta: 0:00:25 lr: 8.333333333333334e-06 sample/s: 11.499151471285733 loss: 0.2638 (0.2769) time: 0.3481 data: 0.0030 max mem: 7829
- 2021-05-21 19:56:21,192 INFO torchdistill.misc.log Epoch: [1] [250/268] eta: 0:00:06 lr: 7.089552238805971e-06 sample/s: 11.464639589773867 loss: 0.2114 (0.2705) time: 0.3626 data: 0.0029 max mem: 7829
- 2021-05-21 19:56:27,657 INFO torchdistill.misc.log Epoch: [1] Total time: 0:01:40
- 2021-05-21 19:56:30,765 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/cola/default_experiment-1-0.arrow
- 2021-05-21 19:56:30,766 INFO __main__ Validation: matthews_correlation = 0.5534186082067715
- 2021-05-21 19:56:31,119 INFO torchdistill.misc.log Epoch: [2] [ 0/268] eta: 0:01:34 lr: 6.64179104477612e-06 sample/s: 11.464655258421223 loss: 0.1139 (0.1139) time: 0.3523 data: 0.0034 max mem: 7829
- 2021-05-21 19:56:49,610 INFO torchdistill.misc.log Epoch: [2] [ 50/268] eta: 0:01:20 lr: 5.398009950248757e-06 sample/s: 9.450781311612081 loss: 0.1094 (0.1528) time: 0.3744 data: 0.0030 max mem: 7829
- 2021-05-21 19:57:08,112 INFO torchdistill.misc.log Epoch: [2] [100/268] eta: 0:01:02 lr: 4.1542288557213935e-06 sample/s: 11.468636482331405 loss: 0.1260 (0.1527) time: 0.3559 data: 0.0030 max mem: 7829
- 2021-05-21 19:57:26,908 INFO torchdistill.misc.log Epoch: [2] [150/268] eta: 0:00:43 lr: 2.9104477611940303e-06 sample/s: 11.340868578891262 loss: 0.1274 (0.1475) time: 0.3707 data: 0.0031 max mem: 7829
- 2021-05-21 19:57:45,674 INFO torchdistill.misc.log Epoch: [2] [200/268] eta: 0:00:25 lr: 1.6666666666666667e-06 sample/s: 9.455734970044356 loss: 0.1542 (0.1513) time: 0.3704 data: 0.0030 max mem: 7829
- 2021-05-21 19:58:04,690 INFO torchdistill.misc.log Epoch: [2] [250/268] eta: 0:00:06 lr: 4.2288557213930354e-07 sample/s: 11.481366714661908 loss: 0.0847 (0.1528) time: 0.3822 data: 0.0031 max mem: 7829
- 2021-05-21 19:58:10,794 INFO torchdistill.misc.log Epoch: [2] Total time: 0:01:40
- 2021-05-21 19:58:13,901 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/cola/default_experiment-1-0.arrow
- 2021-05-21 19:58:13,902 INFO __main__ Validation: matthews_correlation = 0.6008507462939041
- 2021-05-21 19:58:13,902 INFO __main__ Updating ckpt
- 2021-05-21 19:58:24,318 INFO __main__ [Student: bert-large-uncased]
- 2021-05-21 19:58:27,448 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/cola/default_experiment-1-0.arrow
- 2021-05-21 19:58:27,449 INFO __main__ Test: matthews_correlation = 0.6008507462939041
- 2021-05-21 19:58:27,449 INFO __main__ Start prediction for private dataset(s)
- 2021-05-21 19:58:27,450 INFO __main__ cola/test: 1063 samples
+ 2021-05-25 22:10:47,882 INFO __main__ Namespace(adjust_lr=False, config='torchdistill/configs/sample/glue/cola/ce/bert_large_uncased.yaml', log='log/glue/cola/ce/bert_large_uncased.txt', private_output='leaderboard/glue/standard/bert_large_uncased/', seed=None, student_only=False, task_name='cola', test_only=False, world_size=1)
+ 2021-05-25 22:10:47,924 INFO __main__ Distributed environment: NO
  Num processes: 1
  Process index: 0
  Local process index: 0
  Device: cuda
  Use FP16 precision: True
 
+ 2021-05-25 22:11:19,057 WARNING datasets.builder Reusing dataset glue (/root/.cache/huggingface/datasets/glue/cola/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad)
+ 2021-05-25 22:11:21,447 INFO __main__ Start training
+ 2021-05-25 22:11:21,448 INFO torchdistill.models.util [student model]
+ 2021-05-25 22:11:21,448 INFO torchdistill.models.util Using the original student model
+ 2021-05-25 22:11:21,448 INFO torchdistill.core.training Loss = 1.0 * OrgLoss
+ 2021-05-25 22:11:26,160 INFO torchdistill.misc.log Epoch: [0] [ 0/535] eta: 0:03:19 lr: 1.998753894080997e-05 sample/s: 11.577144663106461 loss: 1.2302 (1.2302) time: 0.3729 data: 0.0273 max mem: 2567
+ 2021-05-25 22:11:38,497 INFO torchdistill.misc.log Epoch: [0] [ 50/535] eta: 0:02:00 lr: 1.936448598130841e-05 sample/s: 18.961848487999386 loss: 0.5031 (0.6982) time: 0.2536 data: 0.0022 max mem: 6363
+ 2021-05-25 22:11:51,105 INFO torchdistill.misc.log Epoch: [0] [100/535] eta: 0:01:49 lr: 1.8741433021806853e-05 sample/s: 15.993304194887585 loss: 0.4494 (0.5958) time: 0.2486 data: 0.0021 max mem: 6582
+ 2021-05-25 22:12:03,585 INFO torchdistill.misc.log Epoch: [0] [150/535] eta: 0:01:36 lr: 1.8118380062305295e-05 sample/s: 18.00978787218443 loss: 0.4563 (0.5483) time: 0.2487 data: 0.0022 max mem: 6582
+ 2021-05-25 22:12:16,299 INFO torchdistill.misc.log Epoch: [0] [200/535] eta: 0:01:24 lr: 1.749532710280374e-05 sample/s: 14.060560956947262 loss: 0.4476 (0.5233) time: 0.2532 data: 0.0024 max mem: 6582
+ 2021-05-25 22:12:28,756 INFO torchdistill.misc.log Epoch: [0] [250/535] eta: 0:01:11 lr: 1.6872274143302183e-05 sample/s: 15.967990192999187 loss: 0.4905 (0.5099) time: 0.2481 data: 0.0022 max mem: 6588
+ 2021-05-25 22:12:41,353 INFO torchdistill.misc.log Epoch: [0] [300/535] eta: 0:00:58 lr: 1.6249221183800625e-05 sample/s: 15.929238896579488 loss: 0.3546 (0.4883) time: 0.2483 data: 0.0024 max mem: 6588
+ 2021-05-25 22:12:54,011 INFO torchdistill.misc.log Epoch: [0] [350/535] eta: 0:00:46 lr: 1.5626168224299067e-05 sample/s: 15.80400535051527 loss: 0.3522 (0.4753) time: 0.2495 data: 0.0023 max mem: 6588
+ 2021-05-25 22:13:06,660 INFO torchdistill.misc.log Epoch: [0] [400/535] eta: 0:00:33 lr: 1.500311526479751e-05 sample/s: 15.933353720267208 loss: 0.4134 (0.4657) time: 0.2548 data: 0.0022 max mem: 6588
+ 2021-05-25 22:13:19,462 INFO torchdistill.misc.log Epoch: [0] [450/535] eta: 0:00:21 lr: 1.4380062305295952e-05 sample/s: 14.186444877192478 loss: 0.3114 (0.4558) time: 0.2487 data: 0.0022 max mem: 6789
+ 2021-05-25 22:13:32,122 INFO torchdistill.misc.log Epoch: [0] [500/535] eta: 0:00:08 lr: 1.3757009345794394e-05 sample/s: 15.952928199910618 loss: 0.3921 (0.4485) time: 0.2496 data: 0.0022 max mem: 6789
+ 2021-05-25 22:13:40,793 INFO torchdistill.misc.log Epoch: [0] Total time: 0:02:15
+ 2021-05-25 22:13:44,271 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/cola/default_experiment-1-0.arrow
+ 2021-05-25 22:13:44,272 INFO __main__ Validation: matthews_correlation = 0.5806473000395166
+ 2021-05-25 22:13:44,272 INFO __main__ Updating ckpt
+ 2021-05-25 22:13:51,970 INFO torchdistill.misc.log Epoch: [1] [ 0/535] eta: 0:02:25 lr: 1.3320872274143304e-05 sample/s: 14.850208406541903 loss: 0.1800 (0.1800) time: 0.2726 data: 0.0032 max mem: 6789
+ 2021-05-25 22:14:04,713 INFO torchdistill.misc.log Epoch: [1] [ 50/535] eta: 0:02:03 lr: 1.2697819314641746e-05 sample/s: 15.958785637372966 loss: 0.1996 (0.2190) time: 0.2530 data: 0.0022 max mem: 6789
+ 2021-05-25 22:14:17,311 INFO torchdistill.misc.log Epoch: [1] [100/535] eta: 0:01:50 lr: 1.2074766355140188e-05 sample/s: 15.915004705096683 loss: 0.2385 (0.2362) time: 0.2554 data: 0.0024 max mem: 6789
+ 2021-05-25 22:14:30,069 INFO torchdistill.misc.log Epoch: [1] [150/535] eta: 0:01:37 lr: 1.145171339563863e-05 sample/s: 15.867590506617637 loss: 0.1191 (0.2252) time: 0.2559 data: 0.0023 max mem: 6789
+ 2021-05-25 22:14:42,525 INFO torchdistill.misc.log Epoch: [1] [200/535] eta: 0:01:24 lr: 1.0828660436137072e-05 sample/s: 15.931447324440123 loss: 0.2200 (0.2283) time: 0.2491 data: 0.0024 max mem: 6789
+ 2021-05-25 22:14:55,329 INFO torchdistill.misc.log Epoch: [1] [250/535] eta: 0:01:12 lr: 1.0205607476635516e-05 sample/s: 17.869222027311004 loss: 0.2449 (0.2366) time: 0.2565 data: 0.0022 max mem: 6789
+ 2021-05-25 22:15:07,884 INFO torchdistill.misc.log Epoch: [1] [300/535] eta: 0:00:59 lr: 9.582554517133958e-06 sample/s: 18.815330197672253 loss: 0.0952 (0.2291) time: 0.2519 data: 0.0023 max mem: 6789
+ 2021-05-25 22:15:20,408 INFO torchdistill.misc.log Epoch: [1] [350/535] eta: 0:00:46 lr: 8.9595015576324e-06 sample/s: 18.9154784279283 loss: 0.1866 (0.2326) time: 0.2561 data: 0.0024 max mem: 6789
+ 2021-05-25 22:15:33,181 INFO torchdistill.misc.log Epoch: [1] [400/535] eta: 0:00:34 lr: 8.336448598130842e-06 sample/s: 15.924173926909392 loss: 0.2051 (0.2323) time: 0.2555 data: 0.0022 max mem: 6793
+ 2021-05-25 22:15:45,996 INFO torchdistill.misc.log Epoch: [1] [450/535] eta: 0:00:21 lr: 7.713395638629284e-06 sample/s: 15.914974510945969 loss: 0.1959 (0.2370) time: 0.2510 data: 0.0023 max mem: 6793
+ 2021-05-25 22:15:58,625 INFO torchdistill.misc.log Epoch: [1] [500/535] eta: 0:00:08 lr: 7.090342679127727e-06 sample/s: 15.935654106628926 loss: 0.2647 (0.2354) time: 0.2522 data: 0.0023 max mem: 6793
+ 2021-05-25 22:16:07,126 INFO torchdistill.misc.log Epoch: [1] Total time: 0:02:15
+ 2021-05-25 22:16:10,577 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/cola/default_experiment-1-0.arrow
+ 2021-05-25 22:16:10,578 INFO __main__ Validation: matthews_correlation = 0.6043989222564181
+ 2021-05-25 22:16:10,578 INFO __main__ Updating ckpt
+ 2021-05-25 22:16:17,171 INFO torchdistill.misc.log Epoch: [2] [ 0/535] eta: 0:02:25 lr: 6.654205607476636e-06 sample/s: 14.956847949733799 loss: 0.0537 (0.0537) time: 0.2712 data: 0.0037 max mem: 6793
+ 2021-05-25 22:16:30,294 INFO torchdistill.misc.log Epoch: [2] [ 50/535] eta: 0:02:07 lr: 6.031152647975078e-06 sample/s: 14.163199286826208 loss: 0.0417 (0.1182) time: 0.2616 data: 0.0023 max mem: 6793
+ 2021-05-25 22:16:42,753 INFO torchdistill.misc.log Epoch: [2] [100/535] eta: 0:01:51 lr: 5.408099688473521e-06 sample/s: 18.72304600183244 loss: 0.0036 (0.1272) time: 0.2478 data: 0.0023 max mem: 6793
+ 2021-05-25 22:16:55,421 INFO torchdistill.misc.log Epoch: [2] [150/535] eta: 0:01:38 lr: 4.7850467289719636e-06 sample/s: 17.933821981022056 loss: 0.0642 (0.1666) time: 0.2506 data: 0.0023 max mem: 6793
+ 2021-05-25 22:17:08,064 INFO torchdistill.misc.log Epoch: [2] [200/535] eta: 0:01:25 lr: 4.1619937694704055e-06 sample/s: 15.78100828969634 loss: 0.0050 (0.1866) time: 0.2558 data: 0.0022 max mem: 6793
+ 2021-05-25 22:17:20,695 INFO torchdistill.misc.log Epoch: [2] [250/535] eta: 0:01:12 lr: 3.5389408099688475e-06 sample/s: 15.911321864152804 loss: 0.1082 (0.1935) time: 0.2541 data: 0.0022 max mem: 6793
+ 2021-05-25 22:17:33,424 INFO torchdistill.misc.log Epoch: [2] [300/535] eta: 0:00:59 lr: 2.91588785046729e-06 sample/s: 15.906162717276503 loss: 0.0000 (0.2150) time: 0.2512 data: 0.0022 max mem: 6793
+ 2021-05-25 22:17:45,850 INFO torchdistill.misc.log Epoch: [2] [350/535] eta: 0:00:46 lr: 2.2928348909657324e-06 sample/s: 15.864574616061812 loss: 0.0202 (0.2241) time: 0.2504 data: 0.0023 max mem: 6793
+ 2021-05-25 22:17:58,459 INFO torchdistill.misc.log Epoch: [2] [400/535] eta: 0:00:34 lr: 1.6697819314641748e-06 sample/s: 15.983202467804412 loss: 0.0000 (0.2294) time: 0.2523 data: 0.0024 max mem: 6793
+ 2021-05-25 22:18:11,040 INFO torchdistill.misc.log Epoch: [2] [450/535] eta: 0:00:21 lr: 1.046728971962617e-06 sample/s: 15.717480588781966 loss: 0.0068 (0.2289) time: 0.2524 data: 0.0023 max mem: 6793
+ 2021-05-25 22:18:23,489 INFO torchdistill.misc.log Epoch: [2] [500/535] eta: 0:00:08 lr: 4.2367601246105923e-07 sample/s: 14.20932160827428 loss: 0.0000 (0.2368) time: 0.2523 data: 0.0023 max mem: 6793
+ 2021-05-25 22:18:31,858 INFO torchdistill.misc.log Epoch: [2] Total time: 0:02:14
+ 2021-05-25 22:18:35,313 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/cola/default_experiment-1-0.arrow
+ 2021-05-25 22:18:35,314 INFO __main__ Validation: matthews_correlation = 0.610638611987945
+ 2021-05-25 22:18:35,314 INFO __main__ Updating ckpt
+ 2021-05-25 22:18:50,900 INFO __main__ [Student: bert-large-uncased]
+ 2021-05-25 22:18:54,369 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/cola/default_experiment-1-0.arrow
+ 2021-05-25 22:18:54,369 INFO __main__ Test: matthews_correlation = 0.610638611987945
+ 2021-05-25 22:18:54,369 INFO __main__ Start prediction for private dataset(s)
+ 2021-05-25 22:18:54,371 INFO __main__ cola/test: 1063 samples
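The logged `lr` values in both runs are consistent with a linear decay to zero over all training steps with no warmup, and the change from 268 to 535 iterations per epoch is consistent with halving the batch size. The sketch below reproduces those numbers under stated assumptions: the base learning rate of 2e-5 and the batch sizes 32 and 16 are inferred from the logged values, not stated anywhere in the log, and 8551 is the size of the CoLA training split. The function names are hypothetical and do not come from torchdistill.

```python
import math

BASE_LR = 2e-5          # assumption: inferred from the logged lr values
NUM_EPOCHS = 3          # epochs [0], [1] and [2] appear in the log
COLA_TRAIN_SIZE = 8551  # number of examples in the CoLA training split


def num_iterations(num_samples: int, batch_size: int) -> int:
    """Iterations per epoch when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)


def logged_lr(epoch: int, iteration: int, iters_per_epoch: int) -> float:
    """Learning rate after the scheduler step of the given iteration,
    assuming linear decay from BASE_LR to zero with no warmup."""
    steps_done = epoch * iters_per_epoch + iteration + 1
    total_steps = NUM_EPOCHS * iters_per_epoch
    return BASE_LR * (1 - steps_done / total_steps)


# Assumed batch sizes: 32 for the earlier run, 16 for the later run.
old_iters = num_iterations(COLA_TRAIN_SIZE, 32)
new_iters = num_iterations(COLA_TRAIN_SIZE, 16)
print(old_iters)  # 268
print(new_iters)  # 535

# Compare against the first logged value of each run.
print(logged_lr(0, 0, old_iters))
print(logged_lr(0, 0, new_iters))
```

Under these assumptions, the later run takes about twice as many optimizer steps per epoch (0:02:15 versus 0:01:40 per epoch in the log) and ends with a validation matthews_correlation of 0.6106 instead of 0.6009.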