2021-05-28 21:41:37,358	INFO	__main__	Namespace(adjust_lr=False, config='torchdistill/configs/sample/glue/qqp/ce/bert_base_uncased.yaml', log='log/glue/qqp/ce/bert_base_uncased.txt', private_output='leaderboard/glue/standard/bert_base_uncased/', seed=None, student_only=False, task_name='qqp', test_only=False, world_size=1)
2021-05-28 21:41:37,386	INFO	__main__	Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda
Use FP16 precision: True

2021-05-28 21:41:42,076	WARNING	datasets.builder	Reusing dataset glue (/root/.cache/huggingface/datasets/glue/qqp/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad)
2021-05-28 21:42:41,913	INFO	__main__	Start training
2021-05-28 21:42:41,913	INFO	torchdistill.models.util	[student model]
2021-05-28 21:42:41,914	INFO	torchdistill.models.util	Using the original student model
2021-05-28 21:42:41,914	INFO	torchdistill.core.training	Loss = 1.0 * OrgLoss
2021-05-28 21:42:44,608	INFO	torchdistill.misc.log	Epoch: [0]  [    0/22741]  eta: 1:02:12  lr: 4.999926710933263e-05  sample/s: 28.09482152306569  loss: 0.5684 (0.5684)  time: 0.1641  data: 0.0218  max mem: 1891
2021-05-28 21:45:02,079	INFO	torchdistill.misc.log	Epoch: [0]  [ 1000/22741]  eta: 0:49:49  lr: 4.926637644196239e-05  sample/s: 22.046275952693826  loss: 0.3256 (0.4391)  time: 0.1462  data: 0.0020  max mem: 3206
2021-05-28 21:47:18,645	INFO	torchdistill.misc.log	Epoch: [0]  [ 2000/22741]  eta: 0:47:22  lr: 4.8533485774592146e-05  sample/s: 23.55741774597576  loss: 0.3501 (0.3987)  time: 0.1345  data: 0.0020  max mem: 3206
2021-05-28 21:49:34,213	INFO	torchdistill.misc.log	Epoch: [0]  [ 3000/22741]  eta: 0:44:55  lr: 4.780059510722191e-05  sample/s: 32.71120261889025  loss: 0.2823 (0.3815)  time: 0.1361  data: 0.0019  max mem: 3206
2021-05-28 21:51:49,521	INFO	torchdistill.misc.log	Epoch: [0]  [ 4000/22741]  eta: 0:42:33  lr: 4.706770443985167e-05  sample/s: 25.242523407373305  loss: 0.3737 (0.3665)  time: 0.1319  data: 0.0019  max mem: 3206
2021-05-28 21:54:05,811	INFO	torchdistill.misc.log	Epoch: [0]  [ 5000/22741]  eta: 0:40:17  lr: 4.633481377248142e-05  sample/s: 32.662548451970494  loss: 0.2040 (0.3526)  time: 0.1379  data: 0.0021  max mem: 3206
2021-05-28 21:56:21,633	INFO	torchdistill.misc.log	Epoch: [0]  [ 6000/22741]  eta: 0:37:59  lr: 4.560192310511118e-05  sample/s: 17.64687147306407  loss: 0.2610 (0.3427)  time: 0.1330  data: 0.0019  max mem: 3206
2021-05-28 21:58:37,550	INFO	torchdistill.misc.log	Epoch: [0]  [ 7000/22741]  eta: 0:35:42  lr: 4.4869032437740936e-05  sample/s: 39.795854661726544  loss: 0.2279 (0.3345)  time: 0.1368  data: 0.0019  max mem: 3206
2021-05-28 22:00:54,589	INFO	torchdistill.misc.log	Epoch: [0]  [ 8000/22741]  eta: 0:33:28  lr: 4.41361417703707e-05  sample/s: 29.80756185917765  loss: 0.3037 (0.3276)  time: 0.1373  data: 0.0020  max mem: 3206
2021-05-28 22:03:09,969	INFO	torchdistill.misc.log	Epoch: [0]  [ 9000/22741]  eta: 0:31:10  lr: 4.340325110300046e-05  sample/s: 31.53214914663003  loss: 0.1971 (0.3208)  time: 0.1426  data: 0.0020  max mem: 3206
2021-05-28 22:05:26,598	INFO	torchdistill.misc.log	Epoch: [0]  [10000/22741]  eta: 0:28:55  lr: 4.267036043563021e-05  sample/s: 32.66242127498029  loss: 0.2713 (0.3154)  time: 0.1431  data: 0.0019  max mem: 3206
2021-05-28 22:07:42,509	INFO	torchdistill.misc.log	Epoch: [0]  [11000/22741]  eta: 0:26:38  lr: 4.193746976825997e-05  sample/s: 39.87350538666844  loss: 0.2716 (0.3113)  time: 0.1377  data: 0.0020  max mem: 3206
2021-05-28 22:09:59,798	INFO	torchdistill.misc.log	Epoch: [0]  [12000/22741]  eta: 0:24:23  lr: 4.1204579100889726e-05  sample/s: 29.81990622411654  loss: 0.1884 (0.3089)  time: 0.1430  data: 0.0020  max mem: 3206
2021-05-28 22:12:15,437	INFO	torchdistill.misc.log	Epoch: [0]  [13000/22741]  eta: 0:22:06  lr: 4.047168843351949e-05  sample/s: 27.23596133085496  loss: 0.2587 (0.3061)  time: 0.1314  data: 0.0020  max mem: 3206
2021-05-28 22:14:30,715	INFO	torchdistill.misc.log	Epoch: [0]  [14000/22741]  eta: 0:19:50  lr: 3.973879776614925e-05  sample/s: 27.213695399343713  loss: 0.2133 (0.3033)  time: 0.1350  data: 0.0019  max mem: 3206
2021-05-28 22:16:46,940	INFO	torchdistill.misc.log	Epoch: [0]  [15000/22741]  eta: 0:17:33  lr: 3.9005907098779e-05  sample/s: 32.61448261114675  loss: 0.1779 (0.3003)  time: 0.1352  data: 0.0019  max mem: 3206
2021-05-28 22:19:01,833	INFO	torchdistill.misc.log	Epoch: [0]  [16000/22741]  eta: 0:15:17  lr: 3.827301643140876e-05  sample/s: 29.83268519160634  loss: 0.2660 (0.2970)  time: 0.1310  data: 0.0020  max mem: 3206
2021-05-28 22:21:16,379	INFO	torchdistill.misc.log	Epoch: [0]  [17000/22741]  eta: 0:13:00  lr: 3.754012576403852e-05  sample/s: 32.618984755190645  loss: 0.2141 (0.2938)  time: 0.1298  data: 0.0019  max mem: 3206
2021-05-28 22:23:31,336	INFO	torchdistill.misc.log	Epoch: [0]  [18000/22741]  eta: 0:10:44  lr: 3.680723509666828e-05  sample/s: 26.997843676178093  loss: 0.2866 (0.2914)  time: 0.1386  data: 0.0020  max mem: 3206
2021-05-28 22:25:46,116	INFO	torchdistill.misc.log	Epoch: [0]  [19000/22741]  eta: 0:08:28  lr: 3.607434442929804e-05  sample/s: 44.86558415163768  loss: 0.2532 (0.2893)  time: 0.1327  data: 0.0020  max mem: 3206
2021-05-28 22:28:02,199	INFO	torchdistill.misc.log	Epoch: [0]  [20000/22741]  eta: 0:06:12  lr: 3.53414537619278e-05  sample/s: 23.53072124231408  loss: 0.1963 (0.2868)  time: 0.1398  data: 0.0019  max mem: 3206
2021-05-28 22:30:17,943	INFO	torchdistill.misc.log	Epoch: [0]  [21000/22741]  eta: 0:03:56  lr: 3.460856309455756e-05  sample/s: 29.843086123508265  loss: 0.1552 (0.2842)  time: 0.1418  data: 0.0019  max mem: 3206
2021-05-28 22:32:32,428	INFO	torchdistill.misc.log	Epoch: [0]  [22000/22741]  eta: 0:01:40  lr: 3.387567242718731e-05  sample/s: 29.88486894254491  loss: 0.1958 (0.2824)  time: 0.1364  data: 0.0019  max mem: 3206
2021-05-28 22:34:12,553	INFO	torchdistill.misc.log	Epoch: [0] Total time: 0:51:28
2021-05-28 22:35:52,473	INFO	/usr/local/lib/python3.7/dist-packages/datasets/metric.py	Removing /root/.cache/huggingface/metrics/glue/qqp/default_experiment-1-0.arrow
2021-05-28 22:35:52,475	INFO	__main__	Validation: accuracy = 0.9028938906752412, f1 = 0.8703006276841757
2021-05-28 22:35:52,475	INFO	__main__	Updating ckpt at ./resource/ckpt/glue/qqp/ce/qqp-bert-base-uncased
2021-05-28 22:35:53,740	INFO	torchdistill.misc.log	Epoch: [1]  [    0/22741]  eta: 1:01:53  lr: 3.3332600442665964e-05  sample/s: 28.324724767312098  loss: 0.5533 (0.5533)  time: 0.1633  data: 0.0221  max mem: 3206
2021-05-28 22:38:09,983	INFO	torchdistill.misc.log	Epoch: [1]  [ 1000/22741]  eta: 0:49:22  lr: 3.2599709775295725e-05  sample/s: 39.881372451138404  loss: 0.0934 (0.1804)  time: 0.1297  data: 0.0019  max mem: 3206
2021-05-28 22:40:25,602	INFO	torchdistill.misc.log	Epoch: [1]  [ 2000/22741]  eta: 0:46:59  lr: 3.1866819107925486e-05  sample/s: 39.8993930861285  loss: 0.1000 (0.1880)  time: 0.1340  data: 0.0019  max mem: 3206
2021-05-28 22:42:40,851	INFO	torchdistill.misc.log	Epoch: [1]  [ 3000/22741]  eta: 0:44:39  lr: 3.113392844055524e-05  sample/s: 39.91201722353725  loss: 0.0397 (0.1897)  time: 0.1337  data: 0.0020  max mem: 3206
2021-05-28 22:44:56,820	INFO	torchdistill.misc.log	Epoch: [1]  [ 4000/22741]  eta: 0:42:24  lr: 3.0401037773184997e-05  sample/s: 35.83940938473304  loss: 0.2073 (0.1937)  time: 0.1387  data: 0.0020  max mem: 3206
2021-05-28 22:47:12,892	INFO	torchdistill.misc.log	Epoch: [1]  [ 5000/22741]  eta: 0:40:09  lr: 2.9668147105814757e-05  sample/s: 23.52620072946129  loss: 0.2317 (0.1947)  time: 0.1377  data: 0.0020  max mem: 3206
2021-05-28 22:49:29,022	INFO	torchdistill.misc.log	Epoch: [1]  [ 6000/22741]  eta: 0:37:54  lr: 2.8935256438444515e-05  sample/s: 35.71216075267673  loss: 0.1682 (0.1950)  time: 0.1327  data: 0.0020  max mem: 3206
2021-05-28 22:51:44,757	INFO	torchdistill.misc.log	Epoch: [1]  [ 7000/22741]  eta: 0:35:38  lr: 2.8202365771074275e-05  sample/s: 35.8185971639261  loss: 0.1936 (0.1957)  time: 0.1277  data: 0.0020  max mem: 3206
2021-05-28 22:54:00,779	INFO	torchdistill.misc.log	Epoch: [1]  [ 8000/22741]  eta: 0:33:23  lr: 2.746947510370403e-05  sample/s: 35.78253914764559  loss: 0.1607 (0.2021)  time: 0.1355  data: 0.0020  max mem: 3206
2021-05-28 22:56:16,445	INFO	torchdistill.misc.log	Epoch: [1]  [ 9000/22741]  eta: 0:31:06  lr: 2.673658443633379e-05  sample/s: 29.859710821758846  loss: 0.0592 (0.2015)  time: 0.1354  data: 0.0020  max mem: 3206
2021-05-28 22:58:32,285	INFO	torchdistill.misc.log	Epoch: [1]  [10000/22741]  eta: 0:28:50  lr: 2.600369376896355e-05  sample/s: 23.53702646320642  loss: 0.0289 (0.2013)  time: 0.1486  data: 0.0020  max mem: 3206
2021-05-28 23:00:47,277	INFO	torchdistill.misc.log	Epoch: [1]  [11000/22741]  eta: 0:26:34  lr: 2.5270803101593305e-05  sample/s: 32.66890856445197  loss: 0.0945 (0.2022)  time: 0.1271  data: 0.0019  max mem: 3206
2021-05-28 23:03:03,760	INFO	torchdistill.misc.log	Epoch: [1]  [12000/22741]  eta: 0:24:19  lr: 2.4537912434223065e-05  sample/s: 32.55891074505907  loss: 0.1768 (0.2016)  time: 0.1379  data: 0.0020  max mem: 3206
2021-05-28 23:05:18,319	INFO	torchdistill.misc.log	Epoch: [1]  [13000/22741]  eta: 0:22:02  lr: 2.3805021766852823e-05  sample/s: 32.3476698293464  loss: 0.0365 (0.2002)  time: 0.1354  data: 0.0019  max mem: 3206
2021-05-28 23:07:33,589	INFO	torchdistill.misc.log	Epoch: [1]  [14000/22741]  eta: 0:19:46  lr: 2.307213109948258e-05  sample/s: 29.607043339692197  loss: 0.0187 (0.2008)  time: 0.1379  data: 0.0019  max mem: 3206
2021-05-28 23:09:49,171	INFO	torchdistill.misc.log	Epoch: [1]  [15000/22741]  eta: 0:17:30  lr: 2.2339240432112337e-05  sample/s: 32.65828856186249  loss: 0.3250 (0.2023)  time: 0.1350  data: 0.0019  max mem: 3206
2021-05-28 23:12:04,128	INFO	torchdistill.misc.log	Epoch: [1]  [16000/22741]  eta: 0:15:14  lr: 2.1606349764742098e-05  sample/s: 39.902145036733664  loss: 0.1157 (0.2023)  time: 0.1332  data: 0.0019  max mem: 3206
2021-05-28 23:14:19,963	INFO	torchdistill.misc.log	Epoch: [1]  [17000/22741]  eta: 0:12:58  lr: 2.0873459097371855e-05  sample/s: 29.81842224062732  loss: 0.0740 (0.2035)  time: 0.1371  data: 0.0019  max mem: 3206
2021-05-28 23:16:35,045	INFO	torchdistill.misc.log	Epoch: [1]  [18000/22741]  eta: 0:10:43  lr: 2.0140568430001613e-05  sample/s: 27.17503757839631  loss: 0.1538 (0.2039)  time: 0.1394  data: 0.0022  max mem: 3206
2021-05-28 23:18:51,761	INFO	torchdistill.misc.log	Epoch: [1]  [19000/22741]  eta: 0:08:27  lr: 1.940767776263137e-05  sample/s: 29.899515255203877  loss: 0.2649 (0.2050)  time: 0.1399  data: 0.0019  max mem: 3206
2021-05-28 23:21:07,148	INFO	torchdistill.misc.log	Epoch: [1]  [20000/22741]  eta: 0:06:11  lr: 1.8674787095261127e-05  sample/s: 29.91145595618439  loss: 0.0765 (0.2059)  time: 0.1264  data: 0.0019  max mem: 3206
2021-05-28 23:23:23,180	INFO	torchdistill.misc.log	Epoch: [1]  [21000/22741]  eta: 0:03:56  lr: 1.7941896427890888e-05  sample/s: 35.81056949961473  loss: 0.0140 (0.2059)  time: 0.1309  data: 0.0020  max mem: 3206
2021-05-28 23:25:39,127	INFO	torchdistill.misc.log	Epoch: [1]  [22000/22741]  eta: 0:01:40  lr: 1.7209005760520645e-05  sample/s: 29.85492866014898  loss: 0.1697 (0.2061)  time: 0.1355  data: 0.0020  max mem: 3206
2021-05-28 23:27:19,524	INFO	torchdistill.misc.log	Epoch: [1] Total time: 0:51:25
2021-05-28 23:28:59,456	INFO	/usr/local/lib/python3.7/dist-packages/datasets/metric.py	Removing /root/.cache/huggingface/metrics/glue/qqp/default_experiment-1-0.arrow
2021-05-28 23:28:59,458	INFO	__main__	Validation: accuracy = 0.9066782092505565, f1 = 0.8765904556307854
2021-05-28 23:28:59,459	INFO	__main__	Updating ckpt at ./resource/ckpt/glue/qqp/ce/qqp-bert-base-uncased
2021-05-28 23:29:00,762	INFO	torchdistill.misc.log	Epoch: [2]  [    0/22741]  eta: 0:52:36  lr: 1.66659337759993e-05  sample/s: 35.41260205503162  loss: 0.0020 (0.0020)  time: 0.1388  data: 0.0258  max mem: 3206
2021-05-28 23:31:16,699	INFO	torchdistill.misc.log	Epoch: [2]  [ 1000/22741]  eta: 0:49:15  lr: 1.5933043108629057e-05  sample/s: 35.72493622530461  loss: 0.0000 (0.2282)  time: 0.1340  data: 0.0020  max mem: 3206
2021-05-28 23:33:33,394	INFO	torchdistill.misc.log	Epoch: [2]  [ 2000/22741]  eta: 0:47:07  lr: 1.5200152441258813e-05  sample/s: 30.27454801694068  loss: 0.0000 (0.2493)  time: 0.1335  data: 0.0019  max mem: 3206
2021-05-28 23:35:48,660	INFO	torchdistill.misc.log	Epoch: [2]  [ 3000/22741]  eta: 0:44:44  lr: 1.4467261773888572e-05  sample/s: 36.234792036525896  loss: 0.0000 (0.2656)  time: 0.1393  data: 0.0019  max mem: 3206
2021-05-28 23:38:03,495	INFO	torchdistill.misc.log	Epoch: [2]  [ 4000/22741]  eta: 0:42:22  lr: 1.373437110651833e-05  sample/s: 27.239410504986875  loss: 0.0000 (0.2649)  time: 0.1332  data: 0.0019  max mem: 3206
2021-05-28 23:40:18,370	INFO	torchdistill.misc.log	Epoch: [2]  [ 5000/22741]  eta: 0:40:04  lr: 1.300148043914809e-05  sample/s: 27.191730267732257  loss: 0.0001 (0.2653)  time: 0.1463  data: 0.0020  max mem: 3206
2021-05-28 23:42:33,360	INFO	torchdistill.misc.log	Epoch: [2]  [ 6000/22741]  eta: 0:37:47  lr: 1.2268589771777847e-05  sample/s: 32.66865411239648  loss: 0.0000 (0.2644)  time: 0.1408  data: 0.0020  max mem: 3206
2021-05-28 23:44:47,853	INFO	torchdistill.misc.log	Epoch: [2]  [ 7000/22741]  eta: 0:35:29  lr: 1.1535699104407605e-05  sample/s: 33.143780558875534  loss: 0.0001 (0.2670)  time: 0.1333  data: 0.0020  max mem: 3206
2021-05-28 23:47:02,473	INFO	torchdistill.misc.log	Epoch: [2]  [ 8000/22741]  eta: 0:33:13  lr: 1.0802808437037364e-05  sample/s: 32.695839261005986  loss: 0.0000 (0.2660)  time: 0.1307  data: 0.0019  max mem: 3206
2021-05-28 23:49:15,855	INFO	torchdistill.misc.log	Epoch: [2]  [ 9000/22741]  eta: 0:30:55  lr: 1.0069917769667121e-05  sample/s: 29.891577835226958  loss: 0.0000 (0.2636)  time: 0.1350  data: 0.0020  max mem: 3206
2021-05-28 23:51:30,840	INFO	torchdistill.misc.log	Epoch: [2]  [10000/22741]  eta: 0:28:40  lr: 9.33702710229688e-06  sample/s: 35.751504987501946  loss: 0.0000 (0.2629)  time: 0.1275  data: 0.0021  max mem: 3206
2021-05-28 23:53:45,522	INFO	torchdistill.misc.log	Epoch: [2]  [11000/22741]  eta: 0:26:24  lr: 8.604136434926637e-06  sample/s: 29.622255337481374  loss: 0.0000 (0.2632)  time: 0.1303  data: 0.0020  max mem: 3206
2021-05-28 23:55:59,507	INFO	torchdistill.misc.log	Epoch: [2]  [12000/22741]  eta: 0:24:08  lr: 7.871245767556396e-06  sample/s: 35.754933635673915  loss: 0.0000 (0.2629)  time: 0.1290  data: 0.0019  max mem: 3206
2021-05-28 23:58:15,335	INFO	torchdistill.misc.log	Epoch: [2]  [13000/22741]  eta: 0:21:54  lr: 7.1383551001861544e-06  sample/s: 35.821579784565124  loss: 0.0058 (0.2626)  time: 0.1291  data: 0.0020  max mem: 3206
2021-05-29 00:00:30,646	INFO	torchdistill.misc.log	Epoch: [2]  [14000/22741]  eta: 0:19:39  lr: 6.405464432815913e-06  sample/s: 33.14227467217681  loss: 0.0000 (0.2610)  time: 0.1348  data: 0.0019  max mem: 3206
2021-05-29 00:02:43,835	INFO	torchdistill.misc.log	Epoch: [2]  [15000/22741]  eta: 0:17:24  lr: 5.672573765445672e-06  sample/s: 27.24148927533408  loss: 0.0000 (0.2609)  time: 0.1321  data: 0.0019  max mem: 3206
2021-05-29 00:04:58,889	INFO	torchdistill.misc.log	Epoch: [2]  [16000/22741]  eta: 0:15:09  lr: 4.93968309807543e-06  sample/s: 35.784523504820406  loss: 0.0000 (0.2591)  time: 0.1300  data: 0.0020  max mem: 3206
2021-05-29 00:07:13,971	INFO	torchdistill.misc.log	Epoch: [2]  [17000/22741]  eta: 0:12:54  lr: 4.206792430705188e-06  sample/s: 29.886306308874037  loss: 0.0000 (0.2589)  time: 0.1387  data: 0.0019  max mem: 3206
2021-05-29 00:09:30,372	INFO	torchdistill.misc.log	Epoch: [2]  [18000/22741]  eta: 0:10:39  lr: 3.473901763334946e-06  sample/s: 32.35147476243367  loss: 0.0000 (0.2577)  time: 0.1410  data: 0.0019  max mem: 3206
2021-05-29 00:11:47,544	INFO	torchdistill.misc.log	Epoch: [2]  [19000/22741]  eta: 0:08:25  lr: 2.7410110959647043e-06  sample/s: 25.485669147804952  loss: 0.0000 (0.2556)  time: 0.1318  data: 0.0019  max mem: 3206
2021-05-29 00:14:02,842	INFO	torchdistill.misc.log	Epoch: [2]  [20000/22741]  eta: 0:06:10  lr: 2.0081204285944624e-06  sample/s: 35.68428418592087  loss: 0.0000 (0.2549)  time: 0.1366  data: 0.0020  max mem: 3206
2021-05-29 00:16:16,820	INFO	torchdistill.misc.log	Epoch: [2]  [21000/22741]  eta: 0:03:55  lr: 1.2752297612242206e-06  sample/s: 30.313933171800834  loss: 0.0000 (0.2551)  time: 0.1303  data: 0.0019  max mem: 3206
2021-05-29 00:18:32,152	INFO	torchdistill.misc.log	Epoch: [2]  [22000/22741]  eta: 0:01:40  lr: 5.423390938539788e-07  sample/s: 27.488987015114898  loss: 0.0000 (0.2537)  time: 0.1327  data: 0.0019  max mem: 3206
2021-05-29 00:20:12,904	INFO	torchdistill.misc.log	Epoch: [2] Total time: 0:51:12
2021-05-29 00:21:52,921	INFO	/usr/local/lib/python3.7/dist-packages/datasets/metric.py	Removing /root/.cache/huggingface/metrics/glue/qqp/default_experiment-1-0.arrow
2021-05-29 00:21:52,922	INFO	__main__	Validation: accuracy = 0.9093742270591145, f1 = 0.8781833898530488
2021-05-29 00:21:52,923	INFO	__main__	Updating ckpt at ./resource/ckpt/glue/qqp/ce/qqp-bert-base-uncased
2021-05-29 00:21:57,493	INFO	__main__	[Student: bert-base-uncased]
2021-05-29 00:23:37,517	INFO	/usr/local/lib/python3.7/dist-packages/datasets/metric.py	Removing /root/.cache/huggingface/metrics/glue/qqp/default_experiment-1-0.arrow
2021-05-29 00:23:37,519	INFO	__main__	Test: accuracy = 0.9093742270591145, f1 = 0.8781833898530488
2021-05-29 00:23:37,519	INFO	__main__	Start prediction for private dataset(s)
2021-05-29 00:23:37,520	INFO	__main__	qqp/test: 390965 samples