yoshitomo-matsubara committed on
Commit 808baa0
1 Parent(s): 0a8085f

tuned hyperparameters

Files changed (3)
  1. pytorch_model.bin +1 -1
  2. tokenizer.json +0 -0
  3. training.log +35 -35
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:8ba098155540bea2e11a013969a85955f2f9e761ca8dba92ab29177448cc4126
+ oid sha256:dbd044aa0062f8c65fa5bf0bc36e412c368dfd7ebc3ceb8665f887c59e1041ff
  size 1340742729
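Note on reading the hunk above: `pytorch_model.bin` is stored as a Git LFS pointer, so the diff shows only the pointer text, not the weights. A minimal sketch of how the two pointers could be compared, assuming the standard `key value` pointer layout; the helper name `parse_lfs_pointer` is hypothetical and not part of any library:

```python
def parse_lfs_pointer(text):
    """Parse Git LFS pointer text into its version, hash algorithm, digest and size."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is "<key> <value>", separated by a single space.
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the old and new sides of the diff.
OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:8ba098155540bea2e11a013969a85955f2f9e761ca8dba92ab29177448cc4126
size 1340742729
"""
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:dbd044aa0062f8c65fa5bf0bc36e412c368dfd7ebc3ceb8665f887c59e1041ff
size 1340742729
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)
weights_changed = old["digest"] != new["digest"]  # different content hash
same_size = old["size"] == new["size"]            # identical byte count
```

A changed digest with an unchanged size is what one would expect from retraining the same architecture: the parameter values differ, the tensor shapes do not.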
tokenizer.json CHANGED
The diff for this file is too large to render.
training.log CHANGED
@@ -1,41 +1,41 @@
- 2021-05-21 21:05:54,150 INFO __main__ Namespace(adjust_lr=False, config='torchdistill/configs/sample/glue/stsb/mse/bert_large_uncased.yaml', log='log/glue/stsb/mse/bert_large_uncased.txt', private_output='leaderboard/glue/standard/bert_large_uncased/', seed=None, student_only=False, task_name='stsb', test_only=False, world_size=1)
- 2021-05-21 21:05:54,184 INFO __main__ Distributed environment: NO
  Num processes: 1
  Process index: 0
  Local process index: 0
  Device: cuda
  Use FP16 precision: True
 
- 2021-05-21 21:06:08,347 INFO __main__ Start training
- 2021-05-21 21:06:08,348 INFO torchdistill.models.util [student model]
- 2021-05-21 21:06:08,348 INFO torchdistill.models.util Using the original student model
- 2021-05-21 21:06:08,348 INFO torchdistill.core.training Loss = 1.0 * OrgLoss
- 2021-05-21 21:06:11,614 INFO torchdistill.misc.log Epoch: [0] [ 0/180] eta: 0:01:58 lr: 1.9962962962962963e-05 sample/s: 6.150729612978653 loss: 11.0822 (11.0822) time: 0.6556 data: 0.0052 max mem: 4716
- 2021-05-21 21:06:50,634 INFO torchdistill.misc.log Epoch: [0] [ 50/180] eta: 0:01:41 lr: 1.8111111111111112e-05 sample/s: 4.650817812095187 loss: 5.7362 (7.8091) time: 0.8036 data: 0.0036 max mem: 10866
- 2021-05-21 21:07:31,077 INFO torchdistill.misc.log Epoch: [0] [100/180] eta: 0:01:03 lr: 1.625925925925926e-05 sample/s: 5.073427520805109 loss: 1.5491 (5.1515) time: 0.7817 data: 0.0037 max mem: 10866
- 2021-05-21 21:08:12,096 INFO torchdistill.misc.log Epoch: [0] [150/180] eta: 0:00:24 lr: 1.4407407407407407e-05 sample/s: 4.66012418875246 loss: 0.6607 (3.6948) time: 0.8202 data: 0.0037 max mem: 12375
- 2021-05-21 21:08:33,652 INFO torchdistill.misc.log Epoch: [0] Total time: 0:02:22
- 2021-05-21 21:08:42,643 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/stsb/default_experiment-1-0.arrow
- 2021-05-21 21:08:42,644 INFO __main__ Validation: pearson = 0.869070919965039, spearmanr = 0.8715366411318595
- 2021-05-21 21:08:42,644 INFO __main__ Updating ckpt
- 2021-05-21 21:08:48,054 INFO torchdistill.misc.log Epoch: [1] [ 0/180] eta: 0:02:38 lr: 1.3296296296296298e-05 sample/s: 4.560549161662958 loss: 0.5407 (0.5407) time: 0.8833 data: 0.0062 max mem: 12375
- 2021-05-21 21:09:27,646 INFO torchdistill.misc.log Epoch: [1] [ 50/180] eta: 0:01:43 lr: 1.1444444444444444e-05 sample/s: 5.076204414368103 loss: 0.3929 (0.4579) time: 0.7556 data: 0.0038 max mem: 12375
- 2021-05-21 21:10:06,919 INFO torchdistill.misc.log Epoch: [1] [100/180] eta: 0:01:03 lr: 9.592592592592593e-06 sample/s: 5.071968912559318 loss: 0.3982 (0.4221) time: 0.8143 data: 0.0039 max mem: 12375
- 2021-05-21 21:10:47,745 INFO torchdistill.misc.log Epoch: [1] [150/180] eta: 0:00:23 lr: 7.74074074074074e-06 sample/s: 5.072432017980795 loss: 0.3499 (0.4130) time: 0.8114 data: 0.0038 max mem: 12375
- 2021-05-21 21:11:09,859 INFO torchdistill.misc.log Epoch: [1] Total time: 0:02:22
- 2021-05-21 21:11:18,716 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/stsb/default_experiment-1-0.arrow
- 2021-05-21 21:11:18,716 INFO __main__ Validation: pearson = 0.8958498997945837, spearmanr = 0.8947816880373578
- 2021-05-21 21:11:18,716 INFO __main__ Updating ckpt
- 2021-05-21 21:11:24,308 INFO torchdistill.misc.log Epoch: [2] [ 0/180] eta: 0:02:26 lr: 6.62962962962963e-06 sample/s: 4.964641330957722 loss: 0.4758 (0.4758) time: 0.8114 data: 0.0057 max mem: 12375
- 2021-05-21 21:12:04,343 INFO torchdistill.misc.log Epoch: [2] [ 50/180] eta: 0:01:44 lr: 4.777777777777778e-06 sample/s: 5.660394215048351 loss: 0.2219 (0.2284) time: 0.7505 data: 0.0037 max mem: 12375
- 2021-05-21 21:12:44,799 INFO torchdistill.misc.log Epoch: [2] [100/180] eta: 0:01:04 lr: 2.9259259259259257e-06 sample/s: 5.058339142243115 loss: 0.1955 (0.2248) time: 0.7775 data: 0.0037 max mem: 12375
- 2021-05-21 21:13:24,530 INFO torchdistill.misc.log Epoch: [2] [150/180] eta: 0:00:24 lr: 1.074074074074074e-06 sample/s: 5.060421749670249 loss: 0.2013 (0.2159) time: 0.7998 data: 0.0037 max mem: 12375
- 2021-05-21 21:13:47,122 INFO torchdistill.misc.log Epoch: [2] Total time: 0:02:23
- 2021-05-21 21:13:55,981 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/stsb/default_experiment-1-0.arrow
- 2021-05-21 21:13:55,982 INFO __main__ Validation: pearson = 0.8982067559574533, spearmanr = 0.8962432413562028
- 2021-05-21 21:13:55,982 INFO __main__ Updating ckpt
- 2021-05-21 21:14:06,537 INFO __main__ [Student: bert-large-uncased]
- 2021-05-21 21:14:15,405 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/stsb/default_experiment-1-0.arrow
- 2021-05-21 21:14:15,405 INFO __main__ Test: pearson = 0.8982067559574533, spearmanr = 0.8962432413562028
- 2021-05-21 21:14:15,405 INFO __main__ Start prediction for private dataset(s)
- 2021-05-21 21:14:15,407 INFO __main__ stsb/test: 1379 samples
+ 2021-05-25 20:17:27,706 INFO __main__ Namespace(adjust_lr=False, config='torchdistill/configs/sample/glue/stsb/mse/bert_large_uncased.yaml', log='log/glue/stsb/mse/bert_large_uncased.txt', private_output='leaderboard/glue/standard/bert_large_uncased/', seed=None, student_only=False, task_name='stsb', test_only=False, world_size=1)
+ 2021-05-25 20:17:27,744 INFO __main__ Distributed environment: NO
  Num processes: 1
  Process index: 0
  Local process index: 0
  Device: cuda
  Use FP16 precision: True
 
+ 2021-05-25 20:17:37,444 INFO __main__ Start training
+ 2021-05-25 20:17:37,444 INFO torchdistill.models.util [student model]
+ 2021-05-25 20:17:37,445 INFO torchdistill.models.util Using the original student model
+ 2021-05-25 20:17:37,445 INFO torchdistill.core.training Loss = 1.0 * OrgLoss
+ 2021-05-25 20:17:44,656 INFO torchdistill.misc.log Epoch: [0] [ 0/180] eta: 0:02:54 lr: 2.9944444444444443e-05 sample/s: 4.157108273720882 loss: 12.1362 (12.1362) time: 0.9706 data: 0.0084 max mem: 6148
+ 2021-05-25 20:18:24,842 INFO torchdistill.misc.log Epoch: [0] [ 50/180] eta: 0:01:44 lr: 2.716666666666667e-05 sample/s: 5.079776319530379 loss: 7.2100 (10.7919) time: 0.7976 data: 0.0035 max mem: 10858
+ 2021-05-25 20:19:03,758 INFO torchdistill.misc.log Epoch: [0] [100/180] eta: 0:01:03 lr: 2.438888888888889e-05 sample/s: 5.071751190230392 loss: 0.7149 (6.2958) time: 0.7898 data: 0.0039 max mem: 10858
+ 2021-05-25 20:19:42,944 INFO torchdistill.misc.log Epoch: [0] [150/180] eta: 0:00:23 lr: 2.161111111111111e-05 sample/s: 4.658769325438214 loss: 0.5977 (4.4207) time: 0.7743 data: 0.0036 max mem: 10861
+ 2021-05-25 20:20:07,011 INFO torchdistill.misc.log Epoch: [0] Total time: 0:02:23
+ 2021-05-25 20:20:15,900 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/stsb/default_experiment-1-0.arrow
+ 2021-05-25 20:20:15,900 INFO __main__ Validation: pearson = 0.8737826052062094, spearmanr = 0.8735916113223366
+ 2021-05-25 20:20:15,900 INFO __main__ Updating ckpt
+ 2021-05-25 20:20:21,438 INFO torchdistill.misc.log Epoch: [1] [ 0/180] eta: 0:01:58 lr: 1.9944444444444447e-05 sample/s: 6.128770842374588 loss: 0.4325 (0.4325) time: 0.6593 data: 0.0067 max mem: 10861
+ 2021-05-25 20:21:00,988 INFO torchdistill.misc.log Epoch: [1] [ 50/180] eta: 0:01:42 lr: 1.7166666666666666e-05 sample/s: 4.297354562406508 loss: 0.3274 (0.3845) time: 0.8074 data: 0.0038 max mem: 10861
+ 2021-05-25 20:21:41,487 INFO torchdistill.misc.log Epoch: [1] [100/180] eta: 0:01:03 lr: 1.438888888888889e-05 sample/s: 7.083348708467611 loss: 0.3129 (0.3857) time: 0.8153 data: 0.0037 max mem: 12382
+ 2021-05-25 20:22:21,324 INFO torchdistill.misc.log Epoch: [1] [150/180] eta: 0:00:23 lr: 1.161111111111111e-05 sample/s: 6.354031294441572 loss: 0.2832 (0.3657) time: 0.8022 data: 0.0038 max mem: 12382
+ 2021-05-25 20:22:43,475 INFO torchdistill.misc.log Epoch: [1] Total time: 0:02:22
+ 2021-05-25 20:22:52,330 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/stsb/default_experiment-1-0.arrow
+ 2021-05-25 20:22:52,330 INFO __main__ Validation: pearson = 0.8993742147980403, spearmanr = 0.8971063152009764
+ 2021-05-25 20:22:52,331 INFO __main__ Updating ckpt
+ 2021-05-25 20:22:58,258 INFO torchdistill.misc.log Epoch: [2] [ 0/180] eta: 0:02:25 lr: 9.944444444444445e-06 sample/s: 4.98979598816883 loss: 0.1504 (0.1504) time: 0.8066 data: 0.0050 max mem: 12382
+ 2021-05-25 20:23:37,330 INFO torchdistill.misc.log Epoch: [2] [ 50/180] eta: 0:01:41 lr: 7.166666666666667e-06 sample/s: 5.076305784711369 loss: 0.1316 (0.1620) time: 0.7998 data: 0.0037 max mem: 12382
+ 2021-05-25 20:24:18,428 INFO torchdistill.misc.log Epoch: [2] [100/180] eta: 0:01:04 lr: 4.388888888888889e-06 sample/s: 4.601912661611403 loss: 0.1340 (0.1553) time: 0.8041 data: 0.0037 max mem: 12382
+ 2021-05-25 20:24:57,662 INFO torchdistill.misc.log Epoch: [2] [150/180] eta: 0:00:23 lr: 1.6111111111111111e-06 sample/s: 5.072415148418462 loss: 0.1245 (0.1507) time: 0.7877 data: 0.0038 max mem: 12382
+ 2021-05-25 20:25:20,536 INFO torchdistill.misc.log Epoch: [2] Total time: 0:02:23
+ 2021-05-25 20:25:29,396 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/stsb/default_experiment-1-0.arrow
+ 2021-05-25 20:25:29,397 INFO __main__ Validation: pearson = 0.9034122016001204, spearmanr = 0.9010440275420903
+ 2021-05-25 20:25:29,397 INFO __main__ Updating ckpt
+ 2021-05-25 20:25:40,164 INFO __main__ [Student: bert-large-uncased]
+ 2021-05-25 20:25:49,037 INFO /usr/local/lib/python3.7/dist-packages/datasets/metric.py Removing /root/.cache/huggingface/metrics/glue/stsb/default_experiment-1-0.arrow
+ 2021-05-25 20:25:49,038 INFO __main__ Test: pearson = 0.9034122016001204, spearmanr = 0.9010440275420903
+ 2021-05-25 20:25:49,038 INFO __main__ Start prediction for private dataset(s)
+ 2021-05-25 20:25:49,039 INFO __main__ stsb/test: 1379 samples
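Note on the `lr` column: the commit message says only "tuned hyperparameters", and the YAML config is not part of this diff. The logged values are, however, consistent with a linear decay to zero over 3 x 180 = 540 steps, starting from a peak of 2e-5 in the old run and 3e-5 in the new one. The sketch below checks that reading against the log; the peak values, the absence of warmup, and the step offset are inferences from the numbers, not settings confirmed by the source, and the function names are hypothetical:

```python
import math

STEPS_PER_EPOCH = 180  # from the "[  0/180]" progress counters in the log
NUM_EPOCHS = 3         # epochs 0, 1 and 2 appear in the log
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_decay_lr(peak_lr, steps_done, total_steps=TOTAL_STEPS):
    """Learning rate after `steps_done` steps of linear decay from peak_lr to zero."""
    return peak_lr * (1.0 - steps_done / total_steps)


def logged_lr(peak_lr, epoch, iteration):
    """Value expected in the log at a given epoch and iteration.

    Assumption: the logger prints lr after the scheduler has stepped, so
    iteration i of epoch e corresponds to e * 180 + i + 1 completed steps.
    """
    return linear_decay_lr(peak_lr, epoch * STEPS_PER_EPOCH + iteration + 1)


# (epoch, iteration, lr) triples copied from the removed and added log lines.
OLD_RUN = [(0, 0, 1.9962962962962963e-05), (1, 0, 1.3296296296296298e-05),
           (2, 150, 1.074074074074074e-06)]
NEW_RUN = [(0, 0, 2.9944444444444443e-05), (1, 0, 1.9944444444444447e-05),
           (2, 150, 1.6111111111111111e-06)]

old_matches = all(math.isclose(logged_lr(2e-5, e, i), lr, rel_tol=1e-9)
                  for e, i, lr in OLD_RUN)
new_matches = all(math.isclose(logged_lr(3e-5, e, i), lr, rel_tol=1e-9)
                  for e, i, lr in NEW_RUN)
```

If this reading is right, the higher peak learning rate coincides with the final validation Pearson correlation moving from 0.8982 to 0.9034 and Spearman from 0.8962 to 0.9010. With `seed=None` in both runs, part of that difference may be run-to-run variance.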