2021-05-21 20:55:41,099	INFO	__main__	Namespace(adjust_lr=False, config='torchdistill/configs/sample/glue/mrpc/ce/bert_large_uncased.yaml', log='log/glue/mrpc/ce/bert_large_uncased.txt', private_output='leaderboard/glue/standard/bert_large_uncased/', seed=None, student_only=False, task_name='mrpc', test_only=False, world_size=1)
2021-05-21 20:55:41,134	INFO	__main__	Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda
Use FP16 precision: True

2021-05-21 20:56:09,963	INFO	__main__	Start training
2021-05-21 20:56:09,964	INFO	torchdistill.models.util	[student model]
2021-05-21 20:56:09,964	INFO	torchdistill.models.util	Using the original student model
2021-05-21 20:56:09,964	INFO	torchdistill.core.training	Loss = 1.0 * OrgLoss
2021-05-21 20:56:13,340	INFO	torchdistill.misc.log	Epoch: [0]  [  0/115]  eta: 0:01:32  lr: 1.996521739130435e-05  sample/s: 5.0108644488568075  loss: 0.7117 (0.7117)  time: 0.8047  data: 0.0065  max mem: 5401
2021-05-21 20:56:57,890	INFO	torchdistill.misc.log	Epoch: [0]  [ 50/115]  eta: 0:00:57  lr: 1.822608695652174e-05  sample/s: 4.658316585970199  loss: 0.6187 (0.6438)  time: 0.8854  data: 0.0047  max mem: 10945
2021-05-21 20:57:42,797	INFO	torchdistill.misc.log	Epoch: [0]  [100/115]  eta: 0:00:13  lr: 1.6486956521739132e-05  sample/s: 4.6675958893857725  loss: 0.6068 (0.6307)  time: 0.8995  data: 0.0046  max mem: 10946
2021-05-21 20:57:55,065	INFO	torchdistill.misc.log	Epoch: [0] Total time: 0:01:42
2021-05-21 20:57:58,697	INFO	/usr/local/lib/python3.7/dist-packages/datasets/metric.py	Removing /root/.cache/huggingface/metrics/glue/mrpc/default_experiment-1-0.arrow
2021-05-21 20:57:58,698	INFO	__main__	Validation: accuracy = 0.6911764705882353, f1 = 0.8141592920353982
2021-05-21 20:57:58,698	INFO	__main__	Updating ckpt
2021-05-21 20:58:04,096	INFO	torchdistill.misc.log	Epoch: [1]  [  0/115]  eta: 0:01:41  lr: 1.596521739130435e-05  sample/s: 4.550537109441219  loss: 0.6108 (0.6108)  time: 0.8846  data: 0.0056  max mem: 10946
2021-05-21 20:58:48,505	INFO	torchdistill.misc.log	Epoch: [1]  [ 50/115]  eta: 0:00:57  lr: 1.4226086956521742e-05  sample/s: 5.072097714849413  loss: 0.5569 (0.5700)  time: 0.9032  data: 0.0046  max mem: 10946
2021-05-21 20:59:33,555	INFO	torchdistill.misc.log	Epoch: [1]  [100/115]  eta: 0:00:13  lr: 1.2486956521739131e-05  sample/s: 4.004992031655668  loss: 0.5414 (0.5664)  time: 0.8920  data: 0.0046  max mem: 10946
2021-05-21 20:59:45,989	INFO	torchdistill.misc.log	Epoch: [1] Total time: 0:01:42
2021-05-21 20:59:49,619	INFO	/usr/local/lib/python3.7/dist-packages/datasets/metric.py	Removing /root/.cache/huggingface/metrics/glue/mrpc/default_experiment-1-0.arrow
2021-05-21 20:59:49,620	INFO	__main__	Validation: accuracy = 0.7524509803921569, f1 = 0.8378812199036919
2021-05-21 20:59:49,620	INFO	__main__	Updating ckpt
2021-05-21 20:59:55,254	INFO	torchdistill.misc.log	Epoch: [2]  [  0/115]  eta: 0:01:41  lr: 1.196521739130435e-05  sample/s: 4.561543612792821  loss: 0.4717 (0.4717)  time: 0.8828  data: 0.0059  max mem: 10946
2021-05-21 21:00:39,815	INFO	torchdistill.misc.log	Epoch: [2]  [ 50/115]  eta: 0:00:57  lr: 1.022608695652174e-05  sample/s: 4.008234729448237  loss: 0.5461 (0.5267)  time: 0.8850  data: 0.0046  max mem: 10946
2021-05-21 21:01:24,369	INFO	torchdistill.misc.log	Epoch: [2]  [100/115]  eta: 0:00:13  lr: 8.48695652173913e-06  sample/s: 5.080750089185291  loss: 0.4628 (0.5161)  time: 0.8997  data: 0.0047  max mem: 10946
2021-05-21 21:01:36,665	INFO	torchdistill.misc.log	Epoch: [2] Total time: 0:01:42
2021-05-21 21:01:40,295	INFO	/usr/local/lib/python3.7/dist-packages/datasets/metric.py	Removing /root/.cache/huggingface/metrics/glue/mrpc/default_experiment-1-0.arrow
2021-05-21 21:01:40,295	INFO	__main__	Validation: accuracy = 0.7696078431372549, f1 = 0.8469055374592835
2021-05-21 21:01:40,295	INFO	__main__	Updating ckpt
2021-05-21 21:01:45,871	INFO	torchdistill.misc.log	Epoch: [3]  [  0/115]  eta: 0:01:41  lr: 7.965217391304349e-06  sample/s: 4.552310192381945  loss: 0.4230 (0.4230)  time: 0.8846  data: 0.0059  max mem: 10946
2021-05-21 21:02:30,881	INFO	torchdistill.misc.log	Epoch: [3]  [ 50/115]  eta: 0:00:58  lr: 6.226086956521739e-06  sample/s: 4.298450072186489  loss: 0.4338 (0.4609)  time: 0.8961  data: 0.0048  max mem: 10946
2021-05-21 21:03:15,449	INFO	torchdistill.misc.log	Epoch: [3]  [100/115]  eta: 0:00:13  lr: 4.486956521739131e-06  sample/s: 4.298780486242516  loss: 0.4697 (0.4590)  time: 0.8995  data: 0.0047  max mem: 10946
2021-05-21 21:03:27,486	INFO	torchdistill.misc.log	Epoch: [3] Total time: 0:01:42
2021-05-21 21:03:31,118	INFO	/usr/local/lib/python3.7/dist-packages/datasets/metric.py	Removing /root/.cache/huggingface/metrics/glue/mrpc/default_experiment-1-0.arrow
2021-05-21 21:03:31,118	INFO	__main__	Validation: accuracy = 0.7622549019607843, f1 = 0.8186915887850468
2021-05-21 21:03:31,983	INFO	torchdistill.misc.log	Epoch: [4]  [  0/115]  eta: 0:01:39  lr: 3.965217391304348e-06  sample/s: 4.657085579681301  loss: 0.3907 (0.3907)  time: 0.8638  data: 0.0048  max mem: 10946
2021-05-21 21:04:17,067	INFO	torchdistill.misc.log	Epoch: [4]  [ 50/115]  eta: 0:00:58  lr: 2.2260869565217395e-06  sample/s: 4.657376462576989  loss: 0.3632 (0.3954)  time: 0.9041  data: 0.0048  max mem: 10946
2021-05-21 21:05:02,138	INFO	torchdistill.misc.log	Epoch: [4]  [100/115]  eta: 0:00:13  lr: 4.869565217391305e-07  sample/s: 4.656771466962662  loss: 0.4057 (0.3937)  time: 0.8889  data: 0.0047  max mem: 10946
2021-05-21 21:05:14,248	INFO	torchdistill.misc.log	Epoch: [4] Total time: 0:01:43
2021-05-21 21:05:17,878	INFO	/usr/local/lib/python3.7/dist-packages/datasets/metric.py	Removing /root/.cache/huggingface/metrics/glue/mrpc/default_experiment-1-0.arrow
2021-05-21 21:05:17,879	INFO	__main__	Validation: accuracy = 0.7941176470588235, f1 = 0.8571428571428571
2021-05-21 21:05:17,879	INFO	__main__	Updating ckpt
2021-05-21 21:05:28,554	INFO	__main__	[Student: bert-large-uncased]
2021-05-21 21:05:32,209	INFO	/usr/local/lib/python3.7/dist-packages/datasets/metric.py	Removing /root/.cache/huggingface/metrics/glue/mrpc/default_experiment-1-0.arrow
2021-05-21 21:05:32,209	INFO	__main__	Test: accuracy = 0.7941176470588235, f1 = 0.8571428571428571
2021-05-21 21:05:32,210	INFO	__main__	Start prediction for private dataset(s)
2021-05-21 21:05:32,211	INFO	__main__	mrpc/test: 1725 samples