2023-10-18 23:49:39,844 ----------------------------------------------------------------------------------------------------
2023-10-18 23:49:39,845 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(31103, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=81, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 23:49:39,845 ----------------------------------------------------------------------------------------------------
2023-10-18 23:49:39,845 Corpus: 6900 train + 1576 dev + 1833 test sentences
2023-10-18 23:49:39,846 ----------------------------------------------------------------------------------------------------
2023-10-18 23:49:39,846 Train:  6900 sentences
2023-10-18 23:49:39,846         (train_with_dev=False, train_with_test=False)
2023-10-18 23:49:39,846 ----------------------------------------------------------------------------------------------------
2023-10-18 23:49:39,846 Training Params:
2023-10-18 23:49:39,846  - learning_rate: "5e-05" 
2023-10-18 23:49:39,846  - mini_batch_size: "16"
2023-10-18 23:49:39,846  - max_epochs: "10"
2023-10-18 23:49:39,846  - shuffle: "True"
2023-10-18 23:49:39,846 ----------------------------------------------------------------------------------------------------
2023-10-18 23:49:39,846 Plugins:
2023-10-18 23:49:39,846  - TensorboardLogger
2023-10-18 23:49:39,846  - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 23:49:39,846 ----------------------------------------------------------------------------------------------------
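The `lr` values logged during training are consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The following is a minimal sketch of that schedule shape, not Flair's actual scheduler code; the function name `linear_schedule_lr` and its parameters are hypothetical, and the step count assumes one optimizer step per mini-batch.

```python
import math


def linear_schedule_lr(step, total_steps, peak_lr=5e-5, warmup_fraction=0.1):
    """Learning rate after `step` optimizer steps (hypothetical helper).

    Assumes a linear ramp from 0 to peak_lr during warmup, then a linear
    decay from peak_lr to 0 over the remaining steps.
    """
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Values taken from the log: 6900 training sentences, batch size 16, 10 epochs.
iters_per_epoch = math.ceil(6900 / 16)
total_steps = iters_per_epoch * 10

lr_epoch1_iter43 = linear_schedule_lr(43, total_steps)
lr_end_of_warmup = linear_schedule_lr(iters_per_epoch, total_steps)
lr_epoch2_iter430 = linear_schedule_lr(iters_per_epoch + 430, total_steps)
lr_final = linear_schedule_lr(total_steps, total_steps)
```

Under these assumptions the warmup spans exactly the first epoch (432 steps), which would explain why the logged rate climbs to 0.000050 by the end of epoch 1 and falls steadily afterwards.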
2023-10-18 23:49:39,846 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 23:49:39,846  - metric: "('micro avg', 'f1-score')"
2023-10-18 23:49:39,846 ----------------------------------------------------------------------------------------------------
2023-10-18 23:49:39,846 Computation:
2023-10-18 23:49:39,846  - compute on device: cuda:0
2023-10-18 23:49:39,847  - embedding storage: none
2023-10-18 23:49:39,847 ----------------------------------------------------------------------------------------------------
2023-10-18 23:49:39,847 Model training base path: "autotrain-flair-mobie-gbert_base-bs16-e10-lr5e-05-1"
2023-10-18 23:49:39,847 ----------------------------------------------------------------------------------------------------
2023-10-18 23:49:39,847 ----------------------------------------------------------------------------------------------------
2023-10-18 23:49:39,847 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-18 23:49:54,007 epoch 1 - iter 43/432 - loss 4.54226396 - time (sec): 14.16 - samples/sec: 427.65 - lr: 0.000005 - momentum: 0.000000
2023-10-18 23:50:08,103 epoch 1 - iter 86/432 - loss 3.39025495 - time (sec): 28.26 - samples/sec: 429.20 - lr: 0.000010 - momentum: 0.000000
2023-10-18 23:50:22,380 epoch 1 - iter 129/432 - loss 2.82298352 - time (sec): 42.53 - samples/sec: 431.35 - lr: 0.000015 - momentum: 0.000000
2023-10-18 23:50:35,740 epoch 1 - iter 172/432 - loss 2.47385817 - time (sec): 55.89 - samples/sec: 437.97 - lr: 0.000020 - momentum: 0.000000
2023-10-18 23:50:49,999 epoch 1 - iter 215/432 - loss 2.21787123 - time (sec): 70.15 - samples/sec: 433.61 - lr: 0.000025 - momentum: 0.000000
2023-10-18 23:51:03,183 epoch 1 - iter 258/432 - loss 2.00329484 - time (sec): 83.33 - samples/sec: 440.26 - lr: 0.000030 - momentum: 0.000000
2023-10-18 23:51:16,381 epoch 1 - iter 301/432 - loss 1.82489643 - time (sec): 96.53 - samples/sec: 447.14 - lr: 0.000035 - momentum: 0.000000
2023-10-18 23:51:30,192 epoch 1 - iter 344/432 - loss 1.69046523 - time (sec): 110.34 - samples/sec: 445.93 - lr: 0.000040 - momentum: 0.000000
2023-10-18 23:51:43,742 epoch 1 - iter 387/432 - loss 1.57702628 - time (sec): 123.89 - samples/sec: 444.99 - lr: 0.000045 - momentum: 0.000000
2023-10-18 23:51:58,002 epoch 1 - iter 430/432 - loss 1.47019726 - time (sec): 138.15 - samples/sec: 446.59 - lr: 0.000050 - momentum: 0.000000
2023-10-18 23:51:58,500 ----------------------------------------------------------------------------------------------------
2023-10-18 23:51:58,501 EPOCH 1 done: loss 1.4681 - lr: 0.000050
2023-10-18 23:52:10,741 DEV : loss 0.48693621158599854 - f1-score (micro avg)  0.7049
2023-10-18 23:52:10,765 saving best model
2023-10-18 23:52:11,236 ----------------------------------------------------------------------------------------------------
2023-10-18 23:52:24,640 epoch 2 - iter 43/432 - loss 0.49120434 - time (sec): 13.40 - samples/sec: 475.14 - lr: 0.000049 - momentum: 0.000000
2023-10-18 23:52:37,470 epoch 2 - iter 86/432 - loss 0.47518885 - time (sec): 26.23 - samples/sec: 473.80 - lr: 0.000049 - momentum: 0.000000
2023-10-18 23:52:50,585 epoch 2 - iter 129/432 - loss 0.46930182 - time (sec): 39.35 - samples/sec: 467.10 - lr: 0.000048 - momentum: 0.000000
2023-10-18 23:53:03,976 epoch 2 - iter 172/432 - loss 0.45056911 - time (sec): 52.74 - samples/sec: 469.05 - lr: 0.000048 - momentum: 0.000000
2023-10-18 23:53:18,853 epoch 2 - iter 215/432 - loss 0.43792445 - time (sec): 67.62 - samples/sec: 458.89 - lr: 0.000047 - momentum: 0.000000
2023-10-18 23:53:32,607 epoch 2 - iter 258/432 - loss 0.43015339 - time (sec): 81.37 - samples/sec: 460.55 - lr: 0.000047 - momentum: 0.000000
2023-10-18 23:53:46,790 epoch 2 - iter 301/432 - loss 0.41739009 - time (sec): 95.55 - samples/sec: 457.84 - lr: 0.000046 - momentum: 0.000000
2023-10-18 23:54:00,936 epoch 2 - iter 344/432 - loss 0.40919722 - time (sec): 109.70 - samples/sec: 452.88 - lr: 0.000046 - momentum: 0.000000
2023-10-18 23:54:15,261 epoch 2 - iter 387/432 - loss 0.39789179 - time (sec): 124.02 - samples/sec: 448.77 - lr: 0.000045 - momentum: 0.000000
2023-10-18 23:54:30,460 epoch 2 - iter 430/432 - loss 0.39342103 - time (sec): 139.22 - samples/sec: 443.36 - lr: 0.000044 - momentum: 0.000000
2023-10-18 23:54:30,977 ----------------------------------------------------------------------------------------------------
2023-10-18 23:54:30,978 EPOCH 2 done: loss 0.3932 - lr: 0.000044
2023-10-18 23:54:43,400 DEV : loss 0.3221401870250702 - f1-score (micro avg)  0.7999
2023-10-18 23:54:43,424 saving best model
2023-10-18 23:54:44,719 ----------------------------------------------------------------------------------------------------
2023-10-18 23:54:58,597 epoch 3 - iter 43/432 - loss 0.26073780 - time (sec): 13.88 - samples/sec: 440.91 - lr: 0.000044 - momentum: 0.000000
2023-10-18 23:55:12,781 epoch 3 - iter 86/432 - loss 0.26640971 - time (sec): 28.06 - samples/sec: 433.46 - lr: 0.000043 - momentum: 0.000000
2023-10-18 23:55:27,533 epoch 3 - iter 129/432 - loss 0.25801074 - time (sec): 42.81 - samples/sec: 424.09 - lr: 0.000043 - momentum: 0.000000
2023-10-18 23:55:42,492 epoch 3 - iter 172/432 - loss 0.24957799 - time (sec): 57.77 - samples/sec: 421.11 - lr: 0.000042 - momentum: 0.000000
2023-10-18 23:55:57,651 epoch 3 - iter 215/432 - loss 0.25398995 - time (sec): 72.93 - samples/sec: 417.81 - lr: 0.000042 - momentum: 0.000000
2023-10-18 23:56:12,408 epoch 3 - iter 258/432 - loss 0.25510902 - time (sec): 87.69 - samples/sec: 421.63 - lr: 0.000041 - momentum: 0.000000
2023-10-18 23:56:27,276 epoch 3 - iter 301/432 - loss 0.25604031 - time (sec): 102.55 - samples/sec: 420.25 - lr: 0.000041 - momentum: 0.000000
2023-10-18 23:56:41,333 epoch 3 - iter 344/432 - loss 0.25487803 - time (sec): 116.61 - samples/sec: 424.44 - lr: 0.000040 - momentum: 0.000000
2023-10-18 23:56:55,954 epoch 3 - iter 387/432 - loss 0.25224648 - time (sec): 131.23 - samples/sec: 425.07 - lr: 0.000039 - momentum: 0.000000
2023-10-18 23:57:10,652 epoch 3 - iter 430/432 - loss 0.24985880 - time (sec): 145.93 - samples/sec: 422.88 - lr: 0.000039 - momentum: 0.000000
2023-10-18 23:57:11,301 ----------------------------------------------------------------------------------------------------
2023-10-18 23:57:11,301 EPOCH 3 done: loss 0.2495 - lr: 0.000039
2023-10-18 23:57:24,474 DEV : loss 0.2786136269569397 - f1-score (micro avg)  0.8259
2023-10-18 23:57:24,498 saving best model
2023-10-18 23:57:25,777 ----------------------------------------------------------------------------------------------------
2023-10-18 23:57:39,701 epoch 4 - iter 43/432 - loss 0.17159693 - time (sec): 13.92 - samples/sec: 472.47 - lr: 0.000038 - momentum: 0.000000
2023-10-18 23:57:55,199 epoch 4 - iter 86/432 - loss 0.17789331 - time (sec): 29.42 - samples/sec: 433.90 - lr: 0.000038 - momentum: 0.000000
2023-10-18 23:58:10,250 epoch 4 - iter 129/432 - loss 0.17886504 - time (sec): 44.47 - samples/sec: 428.99 - lr: 0.000037 - momentum: 0.000000
2023-10-18 23:58:24,351 epoch 4 - iter 172/432 - loss 0.17380496 - time (sec): 58.57 - samples/sec: 428.72 - lr: 0.000037 - momentum: 0.000000
2023-10-18 23:58:40,004 epoch 4 - iter 215/432 - loss 0.17639725 - time (sec): 74.23 - samples/sec: 426.47 - lr: 0.000036 - momentum: 0.000000
2023-10-18 23:58:55,251 epoch 4 - iter 258/432 - loss 0.17695320 - time (sec): 89.47 - samples/sec: 425.05 - lr: 0.000036 - momentum: 0.000000
2023-10-18 23:59:10,163 epoch 4 - iter 301/432 - loss 0.17433887 - time (sec): 104.39 - samples/sec: 422.34 - lr: 0.000035 - momentum: 0.000000
2023-10-18 23:59:24,925 epoch 4 - iter 344/432 - loss 0.17056141 - time (sec): 119.15 - samples/sec: 416.50 - lr: 0.000034 - momentum: 0.000000
2023-10-18 23:59:40,330 epoch 4 - iter 387/432 - loss 0.17233317 - time (sec): 134.55 - samples/sec: 414.20 - lr: 0.000034 - momentum: 0.000000
2023-10-18 23:59:55,682 epoch 4 - iter 430/432 - loss 0.17233952 - time (sec): 149.90 - samples/sec: 411.30 - lr: 0.000033 - momentum: 0.000000
2023-10-18 23:59:56,286 ----------------------------------------------------------------------------------------------------
2023-10-18 23:59:56,286 EPOCH 4 done: loss 0.1719 - lr: 0.000033
2023-10-19 00:00:09,553 DEV : loss 0.3027258515357971 - f1-score (micro avg)  0.833
2023-10-19 00:00:09,578 saving best model
2023-10-19 00:00:10,866 ----------------------------------------------------------------------------------------------------
2023-10-19 00:00:24,962 epoch 5 - iter 43/432 - loss 0.13224073 - time (sec): 14.09 - samples/sec: 412.85 - lr: 0.000033 - momentum: 0.000000
2023-10-19 00:00:40,762 epoch 5 - iter 86/432 - loss 0.12870566 - time (sec): 29.89 - samples/sec: 396.63 - lr: 0.000032 - momentum: 0.000000
2023-10-19 00:00:56,297 epoch 5 - iter 129/432 - loss 0.12680988 - time (sec): 45.43 - samples/sec: 391.33 - lr: 0.000032 - momentum: 0.000000
2023-10-19 00:01:11,298 epoch 5 - iter 172/432 - loss 0.12448631 - time (sec): 60.43 - samples/sec: 397.60 - lr: 0.000031 - momentum: 0.000000
2023-10-19 00:01:25,586 epoch 5 - iter 215/432 - loss 0.12976948 - time (sec): 74.72 - samples/sec: 400.65 - lr: 0.000031 - momentum: 0.000000
2023-10-19 00:01:40,737 epoch 5 - iter 258/432 - loss 0.13032162 - time (sec): 89.87 - samples/sec: 401.15 - lr: 0.000030 - momentum: 0.000000
2023-10-19 00:01:55,989 epoch 5 - iter 301/432 - loss 0.13168277 - time (sec): 105.12 - samples/sec: 405.48 - lr: 0.000029 - momentum: 0.000000
2023-10-19 00:02:09,035 epoch 5 - iter 344/432 - loss 0.13191984 - time (sec): 118.17 - samples/sec: 415.52 - lr: 0.000029 - momentum: 0.000000
2023-10-19 00:02:24,258 epoch 5 - iter 387/432 - loss 0.13139996 - time (sec): 133.39 - samples/sec: 414.77 - lr: 0.000028 - momentum: 0.000000
2023-10-19 00:02:40,019 epoch 5 - iter 430/432 - loss 0.13241412 - time (sec): 149.15 - samples/sec: 413.75 - lr: 0.000028 - momentum: 0.000000
2023-10-19 00:02:40,513 ----------------------------------------------------------------------------------------------------
2023-10-19 00:02:40,513 EPOCH 5 done: loss 0.1323 - lr: 0.000028
2023-10-19 00:02:53,879 DEV : loss 0.3319297134876251 - f1-score (micro avg)  0.8275
2023-10-19 00:02:53,909 ----------------------------------------------------------------------------------------------------
2023-10-19 00:03:09,440 epoch 6 - iter 43/432 - loss 0.08999368 - time (sec): 15.53 - samples/sec: 399.56 - lr: 0.000027 - momentum: 0.000000
2023-10-19 00:03:23,839 epoch 6 - iter 86/432 - loss 0.08539150 - time (sec): 29.93 - samples/sec: 429.70 - lr: 0.000027 - momentum: 0.000000
2023-10-19 00:03:38,488 epoch 6 - iter 129/432 - loss 0.09986893 - time (sec): 44.58 - samples/sec: 422.57 - lr: 0.000026 - momentum: 0.000000
2023-10-19 00:03:54,568 epoch 6 - iter 172/432 - loss 0.09913431 - time (sec): 60.66 - samples/sec: 415.33 - lr: 0.000026 - momentum: 0.000000
2023-10-19 00:04:09,161 epoch 6 - iter 215/432 - loss 0.09906129 - time (sec): 75.25 - samples/sec: 409.41 - lr: 0.000025 - momentum: 0.000000
2023-10-19 00:04:23,814 epoch 6 - iter 258/432 - loss 0.09924250 - time (sec): 89.90 - samples/sec: 411.32 - lr: 0.000024 - momentum: 0.000000
2023-10-19 00:04:38,652 epoch 6 - iter 301/432 - loss 0.10059479 - time (sec): 104.74 - samples/sec: 412.85 - lr: 0.000024 - momentum: 0.000000
2023-10-19 00:04:53,540 epoch 6 - iter 344/432 - loss 0.09936378 - time (sec): 119.63 - samples/sec: 413.63 - lr: 0.000023 - momentum: 0.000000
2023-10-19 00:05:08,991 epoch 6 - iter 387/432 - loss 0.09927245 - time (sec): 135.08 - samples/sec: 410.82 - lr: 0.000023 - momentum: 0.000000
2023-10-19 00:05:25,338 epoch 6 - iter 430/432 - loss 0.09843081 - time (sec): 151.43 - samples/sec: 407.47 - lr: 0.000022 - momentum: 0.000000
2023-10-19 00:05:25,876 ----------------------------------------------------------------------------------------------------
2023-10-19 00:05:25,876 EPOCH 6 done: loss 0.0985 - lr: 0.000022
2023-10-19 00:05:39,277 DEV : loss 0.3483152687549591 - f1-score (micro avg)  0.8353
2023-10-19 00:05:39,301 saving best model
2023-10-19 00:05:41,347 ----------------------------------------------------------------------------------------------------
2023-10-19 00:05:56,804 epoch 7 - iter 43/432 - loss 0.07646855 - time (sec): 15.46 - samples/sec: 383.21 - lr: 0.000022 - momentum: 0.000000
2023-10-19 00:06:12,246 epoch 7 - iter 86/432 - loss 0.07564571 - time (sec): 30.90 - samples/sec: 382.43 - lr: 0.000021 - momentum: 0.000000
2023-10-19 00:06:26,486 epoch 7 - iter 129/432 - loss 0.07414267 - time (sec): 45.14 - samples/sec: 405.53 - lr: 0.000021 - momentum: 0.000000
2023-10-19 00:06:41,108 epoch 7 - iter 172/432 - loss 0.07110179 - time (sec): 59.76 - samples/sec: 418.47 - lr: 0.000020 - momentum: 0.000000
2023-10-19 00:06:56,294 epoch 7 - iter 215/432 - loss 0.07443072 - time (sec): 74.95 - samples/sec: 413.30 - lr: 0.000019 - momentum: 0.000000
2023-10-19 00:07:10,672 epoch 7 - iter 258/432 - loss 0.07597555 - time (sec): 89.32 - samples/sec: 417.24 - lr: 0.000019 - momentum: 0.000000
2023-10-19 00:07:25,487 epoch 7 - iter 301/432 - loss 0.07746008 - time (sec): 104.14 - samples/sec: 417.75 - lr: 0.000018 - momentum: 0.000000
2023-10-19 00:07:40,229 epoch 7 - iter 344/432 - loss 0.07686569 - time (sec): 118.88 - samples/sec: 417.03 - lr: 0.000018 - momentum: 0.000000
2023-10-19 00:07:55,052 epoch 7 - iter 387/432 - loss 0.07546838 - time (sec): 133.70 - samples/sec: 413.79 - lr: 0.000017 - momentum: 0.000000
2023-10-19 00:08:09,443 epoch 7 - iter 430/432 - loss 0.07435736 - time (sec): 148.09 - samples/sec: 415.90 - lr: 0.000017 - momentum: 0.000000
2023-10-19 00:08:10,004 ----------------------------------------------------------------------------------------------------
2023-10-19 00:08:10,005 EPOCH 7 done: loss 0.0749 - lr: 0.000017
2023-10-19 00:08:22,467 DEV : loss 0.35942888259887695 - f1-score (micro avg)  0.8395
2023-10-19 00:08:22,492 saving best model
2023-10-19 00:08:23,780 ----------------------------------------------------------------------------------------------------
2023-10-19 00:08:37,812 epoch 8 - iter 43/432 - loss 0.05403566 - time (sec): 14.03 - samples/sec: 443.97 - lr: 0.000016 - momentum: 0.000000
2023-10-19 00:08:50,468 epoch 8 - iter 86/432 - loss 0.05416407 - time (sec): 26.69 - samples/sec: 465.48 - lr: 0.000016 - momentum: 0.000000
2023-10-19 00:09:05,136 epoch 8 - iter 129/432 - loss 0.05231093 - time (sec): 41.35 - samples/sec: 445.46 - lr: 0.000015 - momentum: 0.000000
2023-10-19 00:09:19,064 epoch 8 - iter 172/432 - loss 0.05332684 - time (sec): 55.28 - samples/sec: 439.02 - lr: 0.000014 - momentum: 0.000000
2023-10-19 00:09:32,357 epoch 8 - iter 215/432 - loss 0.05490906 - time (sec): 68.58 - samples/sec: 442.24 - lr: 0.000014 - momentum: 0.000000
2023-10-19 00:09:45,819 epoch 8 - iter 258/432 - loss 0.05470571 - time (sec): 82.04 - samples/sec: 445.59 - lr: 0.000013 - momentum: 0.000000
2023-10-19 00:09:59,775 epoch 8 - iter 301/432 - loss 0.05508927 - time (sec): 95.99 - samples/sec: 444.48 - lr: 0.000013 - momentum: 0.000000
2023-10-19 00:10:13,551 epoch 8 - iter 344/432 - loss 0.05696792 - time (sec): 109.77 - samples/sec: 443.76 - lr: 0.000012 - momentum: 0.000000
2023-10-19 00:10:28,569 epoch 8 - iter 387/432 - loss 0.05709499 - time (sec): 124.79 - samples/sec: 440.02 - lr: 0.000012 - momentum: 0.000000
2023-10-19 00:10:42,292 epoch 8 - iter 430/432 - loss 0.05606310 - time (sec): 138.51 - samples/sec: 445.16 - lr: 0.000011 - momentum: 0.000000
2023-10-19 00:10:42,821 ----------------------------------------------------------------------------------------------------
2023-10-19 00:10:42,821 EPOCH 8 done: loss 0.0560 - lr: 0.000011
2023-10-19 00:10:55,201 DEV : loss 0.3855433464050293 - f1-score (micro avg)  0.8399
2023-10-19 00:10:55,225 saving best model
2023-10-19 00:10:56,513 ----------------------------------------------------------------------------------------------------
2023-10-19 00:11:09,243 epoch 9 - iter 43/432 - loss 0.04130631 - time (sec): 12.73 - samples/sec: 503.43 - lr: 0.000011 - momentum: 0.000000
2023-10-19 00:11:22,023 epoch 9 - iter 86/432 - loss 0.03767518 - time (sec): 25.51 - samples/sec: 484.19 - lr: 0.000010 - momentum: 0.000000
2023-10-19 00:11:35,992 epoch 9 - iter 129/432 - loss 0.04012592 - time (sec): 39.48 - samples/sec: 467.91 - lr: 0.000009 - momentum: 0.000000
2023-10-19 00:11:50,170 epoch 9 - iter 172/432 - loss 0.03874753 - time (sec): 53.66 - samples/sec: 461.63 - lr: 0.000009 - momentum: 0.000000
2023-10-19 00:12:03,449 epoch 9 - iter 215/432 - loss 0.03749511 - time (sec): 66.93 - samples/sec: 465.10 - lr: 0.000008 - momentum: 0.000000
2023-10-19 00:12:17,477 epoch 9 - iter 258/432 - loss 0.03805076 - time (sec): 80.96 - samples/sec: 460.64 - lr: 0.000008 - momentum: 0.000000
2023-10-19 00:12:30,864 epoch 9 - iter 301/432 - loss 0.03919902 - time (sec): 94.35 - samples/sec: 462.58 - lr: 0.000007 - momentum: 0.000000
2023-10-19 00:12:45,129 epoch 9 - iter 344/432 - loss 0.04055903 - time (sec): 108.62 - samples/sec: 455.14 - lr: 0.000007 - momentum: 0.000000
2023-10-19 00:12:59,088 epoch 9 - iter 387/432 - loss 0.04016294 - time (sec): 122.57 - samples/sec: 452.94 - lr: 0.000006 - momentum: 0.000000
2023-10-19 00:13:12,844 epoch 9 - iter 430/432 - loss 0.04071803 - time (sec): 136.33 - samples/sec: 452.09 - lr: 0.000006 - momentum: 0.000000
2023-10-19 00:13:13,346 ----------------------------------------------------------------------------------------------------
2023-10-19 00:13:13,346 EPOCH 9 done: loss 0.0406 - lr: 0.000006
2023-10-19 00:13:25,615 DEV : loss 0.398950457572937 - f1-score (micro avg)  0.8446
2023-10-19 00:13:25,640 saving best model
2023-10-19 00:13:26,920 ----------------------------------------------------------------------------------------------------
2023-10-19 00:13:40,602 epoch 10 - iter 43/432 - loss 0.03991636 - time (sec): 13.68 - samples/sec: 438.30 - lr: 0.000005 - momentum: 0.000000
2023-10-19 00:13:54,198 epoch 10 - iter 86/432 - loss 0.03709518 - time (sec): 27.28 - samples/sec: 436.13 - lr: 0.000004 - momentum: 0.000000
2023-10-19 00:14:07,245 epoch 10 - iter 129/432 - loss 0.03665230 - time (sec): 40.32 - samples/sec: 453.56 - lr: 0.000004 - momentum: 0.000000
2023-10-19 00:14:21,656 epoch 10 - iter 172/432 - loss 0.03488483 - time (sec): 54.73 - samples/sec: 449.35 - lr: 0.000003 - momentum: 0.000000
2023-10-19 00:14:35,583 epoch 10 - iter 215/432 - loss 0.03260316 - time (sec): 68.66 - samples/sec: 449.20 - lr: 0.000003 - momentum: 0.000000
2023-10-19 00:14:50,542 epoch 10 - iter 258/432 - loss 0.03311250 - time (sec): 83.62 - samples/sec: 442.51 - lr: 0.000002 - momentum: 0.000000
2023-10-19 00:15:04,584 epoch 10 - iter 301/432 - loss 0.03267573 - time (sec): 97.66 - samples/sec: 440.01 - lr: 0.000002 - momentum: 0.000000
2023-10-19 00:15:20,118 epoch 10 - iter 344/432 - loss 0.03135514 - time (sec): 113.20 - samples/sec: 434.14 - lr: 0.000001 - momentum: 0.000000
2023-10-19 00:15:35,441 epoch 10 - iter 387/432 - loss 0.03196909 - time (sec): 128.52 - samples/sec: 431.63 - lr: 0.000001 - momentum: 0.000000
2023-10-19 00:15:50,021 epoch 10 - iter 430/432 - loss 0.03303907 - time (sec): 143.10 - samples/sec: 430.68 - lr: 0.000000 - momentum: 0.000000
2023-10-19 00:15:50,585 ----------------------------------------------------------------------------------------------------
2023-10-19 00:15:50,585 EPOCH 10 done: loss 0.0330 - lr: 0.000000
2023-10-19 00:16:03,688 DEV : loss 0.41425013542175293 - f1-score (micro avg)  0.8446
2023-10-19 00:16:03,714 saving best model
2023-10-19 00:16:05,498 ----------------------------------------------------------------------------------------------------
2023-10-19 00:16:05,500 Loading model from best epoch ...
2023-10-19 00:16:07,893 SequenceTagger predicts: Dictionary with 81 tags: O, S-location-route, B-location-route, E-location-route, I-location-route, S-location-stop, B-location-stop, E-location-stop, I-location-stop, S-trigger, B-trigger, E-trigger, I-trigger, S-organization-company, B-organization-company, E-organization-company, I-organization-company, S-location-city, B-location-city, E-location-city, I-location-city, S-location, B-location, E-location, I-location, S-event-cause, B-event-cause, E-event-cause, I-event-cause, S-location-street, B-location-street, E-location-street, I-location-street, S-time, B-time, E-time, I-time, S-date, B-date, E-date, I-date, S-number, B-number, E-number, I-number, S-duration, B-duration, E-duration, I-duration, S-organization
2023-10-19 00:16:25,689 
Results:
- F-score (micro) 0.7706
- F-score (macro) 0.5891
- Accuracy 0.6737

By class:
                      precision    recall  f1-score   support

             trigger     0.7277    0.6062    0.6614       833
       location-stop     0.8421    0.8157    0.8287       765
            location     0.8096    0.8376    0.8234       665
       location-city     0.8068    0.8852    0.8441       566
                date     0.9013    0.8579    0.8791       394
     location-street     0.9417    0.8782    0.9088       386
                time     0.7855    0.8867    0.8330       256
      location-route     0.9053    0.7746    0.8349       284
organization-company     0.7838    0.6905    0.7342       252
            distance     0.9882    1.0000    0.9940       167
              number     0.6760    0.8121    0.7378       149
            duration     0.3484    0.3313    0.3396       163
         event-cause     0.0000    0.0000    0.0000         0
       disaster-type     0.9310    0.3913    0.5510        69
        organization     0.5769    0.5357    0.5556        28
              person     0.5000    1.0000    0.6667        10
                 set     0.0000    0.0000    0.0000         0
        org-position     0.0000    0.0000    0.0000         1
               money     0.0000    0.0000    0.0000         0

           micro avg     0.7637    0.7777    0.7706      4988
           macro avg     0.6065    0.5949    0.5891      4988
        weighted avg     0.8075    0.7777    0.7888      4988

2023-10-19 00:16:25,690 ----------------------------------------------------------------------------------------------------
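As a sanity check on the report above, the summary scores can be recomputed from the per-class rows. This sketch assumes the usual definitions (F1 as the harmonic mean of precision and recall, macro F1 as the unweighted mean of per-class F1); the helper name `f1` is hypothetical, and the inputs are the rounded values printed in the table, so agreement is only expected to four decimals.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Micro-averaged precision and recall from the "micro avg" row.
micro_f1 = f1(0.7637, 0.7777)

# Per-class F1 from the "trigger" row.
trigger_f1 = f1(0.7277, 0.6062)

# Per-class f1-score column, in table order (19 classes, including
# the classes with zero support that score 0.0000).
per_class_f1 = [
    0.6614, 0.8287, 0.8234, 0.8441, 0.8791, 0.9088, 0.8330, 0.8349,
    0.7342, 0.9940, 0.7378, 0.3396, 0.0000, 0.5510, 0.5556, 0.6667,
    0.0000, 0.0000, 0.0000,
]
macro_f1 = sum(per_class_f1) / len(per_class_f1)

print(f"micro F1: {micro_f1:.4f}")
print(f"macro F1: {macro_f1:.4f}")
```

The macro score is pulled down by the classes that score 0.0000 (`event-cause`, `set`, `org-position`, `money`), three of which have no test support at all, which accounts for much of the gap between the micro and macro averages.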