2021-06-01 19:42:00,212	INFO	__main__	Namespace(adjust_lr=False, config='torchdistill/configs/sample/glue/qnli/kd/bert_base_uncased_from_bert_large_uncased.yaml', log='log/glue/qnli/kd/bert_base_uncased_from_bert_large_uncased.txt', private_output='leaderboard/glue/kd/bert_base_uncased_from_bert_large_uncased/', seed=None, student_only=False, task_name='qnli', test_only=False, world_size=1)
2021-06-01 19:42:00,284	INFO	__main__	Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda
Use FP16 precision: True

2021-06-01 19:42:00,637	INFO	filelock	Lock 139934447891088 acquired on /root/.cache/huggingface/transformers/20af61132d3eccaf2a0b4fd9a18767272ba96e31e9a0b1d0035ef88dcdf1825b.92355a00abb9ce3df48e653f7303627944e38f408924eaed14015d4b5ab463c2.lock
2021-06-01 19:42:00,984	INFO	filelock	Lock 139934447891088 released on /root/.cache/huggingface/transformers/20af61132d3eccaf2a0b4fd9a18767272ba96e31e9a0b1d0035ef88dcdf1825b.92355a00abb9ce3df48e653f7303627944e38f408924eaed14015d4b5ab463c2.lock
2021-06-01 19:42:01,689	INFO	filelock	Lock 139934440344528 acquired on /root/.cache/huggingface/transformers/a89adc171199945c88c6380d6e359e2f2f1d9e33fbf2914840ccbcae2a88cc90.d789d64ebfe299b0e416afc4a169632f903f693095b4629a7ea271d5a0cf2c99.lock
2021-06-01 19:42:02,212	INFO	filelock	Lock 139934440344528 released on /root/.cache/huggingface/transformers/a89adc171199945c88c6380d6e359e2f2f1d9e33fbf2914840ccbcae2a88cc90.d789d64ebfe299b0e416afc4a169632f903f693095b4629a7ea271d5a0cf2c99.lock
2021-06-01 19:42:02,562	INFO	filelock	Lock 139934440344528 acquired on /root/.cache/huggingface/transformers/e40cf262b8ee3363b76e16e6b8d7a7cff091ce90d0df585990d20879a18b62d6.6dc9f54d5893dc361ac6ccee1865622847ad90bf0536eeb2043f3e3e2f41078a.lock
2021-06-01 19:42:03,100	INFO	filelock	Lock 139934440344528 released on /root/.cache/huggingface/transformers/e40cf262b8ee3363b76e16e6b8d7a7cff091ce90d0df585990d20879a18b62d6.6dc9f54d5893dc361ac6ccee1865622847ad90bf0536eeb2043f3e3e2f41078a.lock
2021-06-01 19:42:03,804	INFO	filelock	Lock 139934407962256 acquired on /root/.cache/huggingface/transformers/8af4ffda1cdd3bddb5fdcc1e67501f432ccc42db734ab6108c0013a2f653330e.dd8bd9bfd3664b530ea4e645105f557769387b3da9f79bdb55ed556bdd80611d.lock
2021-06-01 19:42:04,154	INFO	filelock	Lock 139934407962256 released on /root/.cache/huggingface/transformers/8af4ffda1cdd3bddb5fdcc1e67501f432ccc42db734ab6108c0013a2f653330e.dd8bd9bfd3664b530ea4e645105f557769387b3da9f79bdb55ed556bdd80611d.lock
2021-06-01 19:42:04,504	INFO	filelock	Lock 139934407988304 acquired on /root/.cache/huggingface/transformers/ceadc7427e9956e9cfd059a42937d2e2622d3da1f2c317b34b12dfc4a7d55d3b.0f95f2171d2c33a9e9e088c1e5decb2dfb3a22fb00d904f96183827da9540426.lock
2021-06-01 19:42:04,853	INFO	filelock	Lock 139934407988304 released on /root/.cache/huggingface/transformers/ceadc7427e9956e9cfd059a42937d2e2622d3da1f2c317b34b12dfc4a7d55d3b.0f95f2171d2c33a9e9e088c1e5decb2dfb3a22fb00d904f96183827da9540426.lock
2021-06-01 19:42:05,233	INFO	filelock	Lock 139934407943440 acquired on /root/.cache/huggingface/transformers/e9885efd6faed99edb75942a797fb277852d1c5e496344391594af18bdb6aa75.142b7305af89599e15e0af127ab907dbd903a36257191ae84062060491593a1e.lock
2021-06-01 19:42:28,226	INFO	filelock	Lock 139934407943440 released on /root/.cache/huggingface/transformers/e9885efd6faed99edb75942a797fb277852d1c5e496344391594af18bdb6aa75.142b7305af89599e15e0af127ab907dbd903a36257191ae84062060491593a1e.lock
2021-06-01 19:42:32,093	INFO	filelock	Lock 139934258577808 acquired on /root/.cache/huggingface/transformers/3c61d016573b14f7f008c02c4e51a366c67ab274726fe2910691e2a761acf43e.37395cee442ab11005bcd270f3c34464dc1704b715b5d7d52b1a461abe3b9e4e.lock
2021-06-01 19:42:32,444	INFO	filelock	Lock 139934258577808 released on /root/.cache/huggingface/transformers/3c61d016573b14f7f008c02c4e51a366c67ab274726fe2910691e2a761acf43e.37395cee442ab11005bcd270f3c34464dc1704b715b5d7d52b1a461abe3b9e4e.lock
2021-06-01 19:42:33,173	INFO	filelock	Lock 139934258580752 acquired on /root/.cache/huggingface/transformers/45c3f7a79a80e1cf0a489e5c62b43f173c15db47864303a55d623bb3c96f72a5.d789d64ebfe299b0e416afc4a169632f903f693095b4629a7ea271d5a0cf2c99.lock
2021-06-01 19:42:33,683	INFO	filelock	Lock 139934258580752 released on /root/.cache/huggingface/transformers/45c3f7a79a80e1cf0a489e5c62b43f173c15db47864303a55d623bb3c96f72a5.d789d64ebfe299b0e416afc4a169632f903f693095b4629a7ea271d5a0cf2c99.lock
2021-06-01 19:42:34,032	INFO	filelock	Lock 139934258102032 acquired on /root/.cache/huggingface/transformers/534479488c54aeaf9c3406f647aa2ec13648c06771ffe269edabebd4c412da1d.7f2721073f19841be16f41b0a70b600ca6b880c8f3df6f3535cbc704371bdfa4.lock
2021-06-01 19:42:34,565	INFO	filelock	Lock 139934258102032 released on /root/.cache/huggingface/transformers/534479488c54aeaf9c3406f647aa2ec13648c06771ffe269edabebd4c412da1d.7f2721073f19841be16f41b0a70b600ca6b880c8f3df6f3535cbc704371bdfa4.lock
2021-06-01 19:42:35,616	INFO	filelock	Lock 139934258119312 acquired on /root/.cache/huggingface/transformers/c1d7f0a763fb63861cc08553866f1fc3e5a6f4f07621be277452d26d71303b7e.20430bd8e10ef77a7d2977accefe796051e01bc2fc4aa146bc862997a1a15e79.lock
2021-06-01 19:42:35,975	INFO	filelock	Lock 139934258119312 released on /root/.cache/huggingface/transformers/c1d7f0a763fb63861cc08553866f1fc3e5a6f4f07621be277452d26d71303b7e.20430bd8e10ef77a7d2977accefe796051e01bc2fc4aa146bc862997a1a15e79.lock
2021-06-01 19:42:36,361	INFO	filelock	Lock 139934258578064 acquired on /root/.cache/huggingface/transformers/a8041bf617d7f94ea26d15e218abd04afc2004805632abc0ed2066aa16d50d04.faf6ea826ae9c5867d12b22257f9877e6b8367890837bd60f7c54a29633f7f2f.lock
2021-06-01 19:42:43,593	INFO	filelock	Lock 139934258578064 released on /root/.cache/huggingface/transformers/a8041bf617d7f94ea26d15e218abd04afc2004805632abc0ed2066aa16d50d04.faf6ea826ae9c5867d12b22257f9877e6b8367890837bd60f7c54a29633f7f2f.lock
2021-06-01 19:43:07,922	INFO	__main__	Start training
2021-06-01 19:43:07,922	INFO	torchdistill.models.util	[teacher model]
2021-06-01 19:43:07,922	INFO	torchdistill.models.util	Using the original teacher model
2021-06-01 19:43:07,922	INFO	torchdistill.models.util	[student model]
2021-06-01 19:43:07,922	INFO	torchdistill.models.util	Using the original student model
2021-06-01 19:43:07,923	INFO	torchdistill.core.distillation	Loss = 1.0 * OrgLoss
2021-06-01 19:43:07,923	INFO	torchdistill.core.distillation	Freezing the whole teacher model
2021-06-01 19:43:14,731	INFO	torchdistill.misc.log	Epoch: [0]  [   0/3274]  eta: 0:36:18  lr: 4.999490938709021e-05  sample/s: 6.268329780814906  loss: 0.6479 (0.6479)  time: 0.6655  data: 0.0274  max mem: 3034
2021-06-01 19:48:36,357	INFO	torchdistill.misc.log	Epoch: [0]  [ 500/3274]  eta: 0:29:44  lr: 4.7449602932193036e-05  sample/s: 6.540947354960249  loss: 0.3303 (0.4136)  time: 0.6375  data: 0.0044  max mem: 5082
2021-06-01 19:53:57,989	INFO	torchdistill.misc.log	Epoch: [0]  [1000/3274]  eta: 0:24:22  lr: 4.490429647729587e-05  sample/s: 6.062278385692094  loss: 0.2386 (0.3724)  time: 0.6654  data: 0.0046  max mem: 5082
2021-06-01 19:59:20,881	INFO	torchdistill.misc.log	Epoch: [0]  [1500/3274]  eta: 0:19:02  lr: 4.23589900223987e-05  sample/s: 7.087861396219321  loss: 0.3091 (0.3486)  time: 0.6669  data: 0.0044  max mem: 5082
2021-06-01 20:04:43,612	INFO	torchdistill.misc.log	Epoch: [0]  [2000/3274]  eta: 0:13:41  lr: 3.981368356750153e-05  sample/s: 6.062981631682493  loss: 0.2964 (0.3324)  time: 0.6403  data: 0.0045  max mem: 5082
2021-06-01 20:10:04,214	INFO	torchdistill.misc.log	Epoch: [0]  [2500/3274]  eta: 0:08:18  lr: 3.726837711260436e-05  sample/s: 6.070009898869445  loss: 0.2783 (0.3190)  time: 0.6220  data: 0.0045  max mem: 5082
2021-06-01 20:15:25,625	INFO	torchdistill.misc.log	Epoch: [0]  [3000/3274]  eta: 0:02:56  lr: 3.472307065770719e-05  sample/s: 6.05868802221069  loss: 0.2463 (0.3135)  time: 0.6771  data: 0.0045  max mem: 5082
2021-06-01 20:18:20,347	INFO	torchdistill.misc.log	Epoch: [0] Total time: 0:35:06
2021-06-01 20:18:40,297	INFO	/usr/local/lib/python3.7/dist-packages/datasets/metric.py	Removing /root/.cache/huggingface/metrics/glue/qnli/default_experiment-1-0.arrow
2021-06-01 20:18:40,297	INFO	__main__	Validation: accuracy = 0.90536335346879
2021-06-01 20:18:40,298	INFO	__main__	Updating ckpt at ./resource/ckpt/glue/qnli/kd/qnli-bert-base-uncased_from_bert-large-uncased
2021-06-01 20:18:42,209	INFO	torchdistill.misc.log	Epoch: [1]  [   0/3274]  eta: 0:32:03  lr: 3.332824272042354e-05  sample/s: 7.010533423814029  loss: 0.0507 (0.0507)  time: 0.5874  data: 0.0169  max mem: 5082
2021-06-01 20:24:02,784	INFO	torchdistill.misc.log	Epoch: [1]  [ 500/3274]  eta: 0:29:38  lr: 3.078293626552637e-05  sample/s: 7.086708737782058  loss: 0.1098 (0.1673)  time: 0.5952  data: 0.0044  max mem: 5082
2021-06-01 20:29:21,701	INFO	torchdistill.misc.log	Epoch: [1]  [1000/3274]  eta: 0:24:14  lr: 2.82376298106292e-05  sample/s: 6.536556439751819  loss: 0.1207 (0.1678)  time: 0.6259  data: 0.0046  max mem: 5082
2021-06-01 20:34:42,169	INFO	torchdistill.misc.log	Epoch: [1]  [1500/3274]  eta: 0:18:55  lr: 2.569232335573203e-05  sample/s: 6.5383523579776845  loss: 0.0869 (0.1672)  time: 0.6195  data: 0.0043  max mem: 5082
2021-06-01 20:40:06,680	INFO	torchdistill.misc.log	Epoch: [1]  [2000/3274]  eta: 0:13:38  lr: 2.314701690083486e-05  sample/s: 7.665929342032576  loss: 0.1216 (0.1649)  time: 0.6682  data: 0.0044  max mem: 5082
2021-06-01 20:45:33,102	INFO	torchdistill.misc.log	Epoch: [1]  [2500/3274]  eta: 0:08:18  lr: 2.060171044593769e-05  sample/s: 7.08687637389096  loss: 0.1018 (0.1626)  time: 0.6454  data: 0.0043  max mem: 5082
2021-06-01 20:50:54,681	INFO	torchdistill.misc.log	Epoch: [1]  [3000/3274]  eta: 0:02:56  lr: 1.8056403991040522e-05  sample/s: 7.090452503412688  loss: 0.0853 (0.1607)  time: 0.6372  data: 0.0044  max mem: 5082
2021-06-01 20:53:50,080	INFO	torchdistill.misc.log	Epoch: [1] Total time: 0:35:08
2021-06-01 20:54:10,003	INFO	/usr/local/lib/python3.7/dist-packages/datasets/metric.py	Removing /root/.cache/huggingface/metrics/glue/qnli/default_experiment-1-0.arrow
2021-06-01 20:54:10,004	INFO	__main__	Validation: accuracy = 0.9178107267069375
2021-06-01 20:54:10,004	INFO	__main__	Updating ckpt at ./resource/ckpt/glue/qnli/kd/qnli-bert-base-uncased_from_bert-large-uncased
2021-06-01 20:54:11,975	INFO	torchdistill.misc.log	Epoch: [2]  [   0/3274]  eta: 0:32:07  lr: 1.6661576053756872e-05  sample/s: 6.9625433519294795  loss: 0.0285 (0.0285)  time: 0.5887  data: 0.0142  max mem: 5082
2021-06-01 20:59:31,890	INFO	torchdistill.misc.log	Epoch: [2]  [ 500/3274]  eta: 0:29:34  lr: 1.4116269598859703e-05  sample/s: 6.541477823907571  loss: 0.0000 (0.0754)  time: 0.6146  data: 0.0045  max mem: 5082
2021-06-01 21:04:51,984	INFO	torchdistill.misc.log	Epoch: [2]  [1000/3274]  eta: 0:24:15  lr: 1.1570963143962533e-05  sample/s: 7.666114992529096  loss: 0.0842 (0.1092)  time: 0.6460  data: 0.0046  max mem: 5082
2021-06-01 21:10:11,480	INFO	torchdistill.misc.log	Epoch: [2]  [1500/3274]  eta: 0:18:54  lr: 9.025656689065364e-06  sample/s: 5.0033478488296845  loss: 0.0114 (0.1235)  time: 0.6660  data: 0.0044  max mem: 5082
2021-06-01 21:15:33,085	INFO	torchdistill.misc.log	Epoch: [2]  [2000/3274]  eta: 0:13:36  lr: 6.480350234168193e-06  sample/s: 5.000420546137726  loss: 0.0001 (0.1282)  time: 0.6606  data: 0.0044  max mem: 5082
2021-06-01 21:20:54,238	INFO	torchdistill.misc.log	Epoch: [2]  [2500/3274]  eta: 0:08:16  lr: 3.935043779271024e-06  sample/s: 7.088882635439985  loss: 0.0000 (0.1301)  time: 0.5969  data: 0.0042  max mem: 5082
2021-06-01 21:26:15,307	INFO	torchdistill.misc.log	Epoch: [2]  [3000/3274]  eta: 0:02:55  lr: 1.3897373243738546e-06  sample/s: 6.539550185149094  loss: 0.0000 (0.1303)  time: 0.6694  data: 0.0045  max mem: 5082
2021-06-01 21:29:10,621	INFO	torchdistill.misc.log	Epoch: [2] Total time: 0:34:59
2021-06-01 21:29:30,567	INFO	/usr/local/lib/python3.7/dist-packages/datasets/metric.py	Removing /root/.cache/huggingface/metrics/glue/qnli/default_experiment-1-0.arrow
2021-06-01 21:29:30,567	INFO	__main__	Validation: accuracy = 0.9101226432363171
2021-06-01 21:29:30,604	INFO	__main__	[Teacher: bert-large-uncased]
2021-06-01 21:30:27,885	INFO	/usr/local/lib/python3.7/dist-packages/datasets/metric.py	Removing /root/.cache/huggingface/metrics/glue/qnli/default_experiment-1-0.arrow
2021-06-01 21:30:27,886	INFO	__main__	Test: accuracy = 0.9223869668680212
2021-06-01 21:30:31,466	INFO	__main__	[Student: bert-base-uncased]
2021-06-01 21:30:51,406	INFO	/usr/local/lib/python3.7/dist-packages/datasets/metric.py	Removing /root/.cache/huggingface/metrics/glue/qnli/default_experiment-1-0.arrow
2021-06-01 21:30:51,406	INFO	__main__	Test: accuracy = 0.9178107267069375
2021-06-01 21:30:51,406	INFO	__main__	Start prediction for private dataset(s)
2021-06-01 21:30:51,407	INFO	__main__	qnli/test: 5463 samples