2023-10-16 18:40:58,252 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,253 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 18:40:58,253 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,253 MultiCorpus: 1166 train + 165 dev + 415 test sentences
 - NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Train:  1166 sentences
2023-10-16 18:40:58,254         (train_with_dev=False, train_with_test=False)
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Training Params:
2023-10-16 18:40:58,254  - learning_rate: "3e-05" 
2023-10-16 18:40:58,254  - mini_batch_size: "8"
2023-10-16 18:40:58,254  - max_epochs: "10"
2023-10-16 18:40:58,254  - shuffle: "True"
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Plugins:
2023-10-16 18:40:58,254  - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
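The LinearScheduler plugin listed above warms the learning rate up over the first 10% of updates and then decays it linearly toward zero, which is consistent with the lr column in the epoch logs that follow. A minimal sketch of that schedule, assuming 146 iterations/epoch × 10 epochs = 1460 total steps (the exact Flair implementation may differ in details such as rounding):

```python
def linear_schedule_lr(step, max_lr=3e-5, total_steps=1460, warmup_fraction=0.1):
    """Linear warmup to max_lr, then linear decay to zero (assumed schedule form)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 146 steps, i.e. one epoch here
    if step < warmup_steps:
        return max_lr * step / warmup_steps
    return max_lr * (total_steps - step) / (total_steps - warmup_steps)

# Early in epoch 1 (iter 14) the lr is still ramping up, matching the log:
print(f"{linear_schedule_lr(14):.6f}")    # ≈ 0.000003
# Peak is reached at the end of warmup, then decays to ~0 by the final iteration:
print(f"{linear_schedule_lr(146):.6f}")   # 0.000030
print(f"{linear_schedule_lr(1454):.6f}")  # ≈ 0.000000
```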
2023-10-16 18:40:58,254 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:40:58,254  - metric: "('micro avg', 'f1-score')"
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Computation:
2023-10-16 18:40:58,254  - compute on device: cuda:0
2023-10-16 18:40:58,254  - embedding storage: none
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:58,254 ----------------------------------------------------------------------------------------------------
2023-10-16 18:40:59,717 epoch 1 - iter 14/146 - loss 2.97390099 - time (sec): 1.46 - samples/sec: 3017.24 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:41:00,900 epoch 1 - iter 28/146 - loss 2.77526217 - time (sec): 2.64 - samples/sec: 3044.65 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:41:02,535 epoch 1 - iter 42/146 - loss 2.37063522 - time (sec): 4.28 - samples/sec: 2990.83 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:41:04,316 epoch 1 - iter 56/146 - loss 1.94618735 - time (sec): 6.06 - samples/sec: 2861.03 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:41:05,651 epoch 1 - iter 70/146 - loss 1.73354724 - time (sec): 7.40 - samples/sec: 2856.99 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:41:07,365 epoch 1 - iter 84/146 - loss 1.56843020 - time (sec): 9.11 - samples/sec: 2831.84 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:41:08,780 epoch 1 - iter 98/146 - loss 1.41105109 - time (sec): 10.53 - samples/sec: 2856.75 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:41:10,204 epoch 1 - iter 112/146 - loss 1.27309307 - time (sec): 11.95 - samples/sec: 2885.22 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:41:11,681 epoch 1 - iter 126/146 - loss 1.16627535 - time (sec): 13.43 - samples/sec: 2901.13 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:41:12,960 epoch 1 - iter 140/146 - loss 1.08434747 - time (sec): 14.70 - samples/sec: 2924.63 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:41:13,475 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:13,475 EPOCH 1 done: loss 1.0599 - lr: 0.000029
2023-10-16 18:41:14,283 DEV : loss 0.22420375049114227 - f1-score (micro avg)  0.3689
2023-10-16 18:41:14,287 saving best model
2023-10-16 18:41:14,730 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:16,109 epoch 2 - iter 14/146 - loss 0.29776445 - time (sec): 1.38 - samples/sec: 3082.48 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:41:17,774 epoch 2 - iter 28/146 - loss 0.31816820 - time (sec): 3.04 - samples/sec: 3091.80 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:41:19,469 epoch 2 - iter 42/146 - loss 0.33197211 - time (sec): 4.74 - samples/sec: 2877.88 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:41:21,233 epoch 2 - iter 56/146 - loss 0.29603529 - time (sec): 6.50 - samples/sec: 2823.99 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:41:22,578 epoch 2 - iter 70/146 - loss 0.28356557 - time (sec): 7.85 - samples/sec: 2835.26 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:41:23,993 epoch 2 - iter 84/146 - loss 0.27591257 - time (sec): 9.26 - samples/sec: 2858.44 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:41:25,233 epoch 2 - iter 98/146 - loss 0.26858017 - time (sec): 10.50 - samples/sec: 2878.25 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:41:26,511 epoch 2 - iter 112/146 - loss 0.25364260 - time (sec): 11.78 - samples/sec: 2913.62 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:41:28,100 epoch 2 - iter 126/146 - loss 0.24500735 - time (sec): 13.37 - samples/sec: 2896.51 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:41:29,365 epoch 2 - iter 140/146 - loss 0.23870700 - time (sec): 14.63 - samples/sec: 2907.77 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:41:30,053 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:30,053 EPOCH 2 done: loss 0.2330 - lr: 0.000027
2023-10-16 18:41:31,652 DEV : loss 0.13390937447547913 - f1-score (micro avg)  0.6225
2023-10-16 18:41:31,657 saving best model
2023-10-16 18:41:32,194 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:33,744 epoch 3 - iter 14/146 - loss 0.10987697 - time (sec): 1.55 - samples/sec: 3085.26 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:41:35,595 epoch 3 - iter 28/146 - loss 0.12399653 - time (sec): 3.40 - samples/sec: 2866.15 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:41:36,764 epoch 3 - iter 42/146 - loss 0.13044732 - time (sec): 4.57 - samples/sec: 2925.68 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:41:37,927 epoch 3 - iter 56/146 - loss 0.12721141 - time (sec): 5.73 - samples/sec: 2937.70 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:41:39,222 epoch 3 - iter 70/146 - loss 0.12759325 - time (sec): 7.03 - samples/sec: 2956.17 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:41:40,773 epoch 3 - iter 84/146 - loss 0.12258261 - time (sec): 8.58 - samples/sec: 2989.78 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:41:42,438 epoch 3 - iter 98/146 - loss 0.12529590 - time (sec): 10.24 - samples/sec: 3008.63 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:41:43,950 epoch 3 - iter 112/146 - loss 0.12515723 - time (sec): 11.75 - samples/sec: 2983.18 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:41:45,318 epoch 3 - iter 126/146 - loss 0.12458492 - time (sec): 13.12 - samples/sec: 2974.87 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:41:46,842 epoch 3 - iter 140/146 - loss 0.12705098 - time (sec): 14.65 - samples/sec: 2933.52 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:41:47,343 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:47,343 EPOCH 3 done: loss 0.1269 - lr: 0.000024
2023-10-16 18:41:48,612 DEV : loss 0.1261664777994156 - f1-score (micro avg)  0.7047
2023-10-16 18:41:48,617 saving best model
2023-10-16 18:41:49,190 ----------------------------------------------------------------------------------------------------
2023-10-16 18:41:50,833 epoch 4 - iter 14/146 - loss 0.08426860 - time (sec): 1.64 - samples/sec: 3083.74 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:41:52,488 epoch 4 - iter 28/146 - loss 0.09754780 - time (sec): 3.29 - samples/sec: 2865.44 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:41:53,974 epoch 4 - iter 42/146 - loss 0.08349681 - time (sec): 4.78 - samples/sec: 2919.36 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:41:55,124 epoch 4 - iter 56/146 - loss 0.08357742 - time (sec): 5.93 - samples/sec: 2923.52 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:41:56,664 epoch 4 - iter 70/146 - loss 0.08252609 - time (sec): 7.47 - samples/sec: 2933.63 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:41:58,302 epoch 4 - iter 84/146 - loss 0.08292927 - time (sec): 9.11 - samples/sec: 2881.12 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:41:59,580 epoch 4 - iter 98/146 - loss 0.08204986 - time (sec): 10.39 - samples/sec: 2876.23 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:42:01,361 epoch 4 - iter 112/146 - loss 0.08113285 - time (sec): 12.17 - samples/sec: 2880.19 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:42:02,669 epoch 4 - iter 126/146 - loss 0.08204963 - time (sec): 13.47 - samples/sec: 2909.89 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:42:03,984 epoch 4 - iter 140/146 - loss 0.08229389 - time (sec): 14.79 - samples/sec: 2910.43 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:42:04,425 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:04,425 EPOCH 4 done: loss 0.0823 - lr: 0.000020
2023-10-16 18:42:05,739 DEV : loss 0.11765624582767487 - f1-score (micro avg)  0.7382
2023-10-16 18:42:05,746 saving best model
2023-10-16 18:42:06,266 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:07,846 epoch 5 - iter 14/146 - loss 0.05045388 - time (sec): 1.58 - samples/sec: 2745.48 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:42:09,412 epoch 5 - iter 28/146 - loss 0.04670668 - time (sec): 3.14 - samples/sec: 2646.53 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:42:10,934 epoch 5 - iter 42/146 - loss 0.04342810 - time (sec): 4.66 - samples/sec: 2726.93 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:42:12,625 epoch 5 - iter 56/146 - loss 0.05072506 - time (sec): 6.35 - samples/sec: 2831.43 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:42:14,211 epoch 5 - iter 70/146 - loss 0.04968675 - time (sec): 7.94 - samples/sec: 2863.10 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:42:15,525 epoch 5 - iter 84/146 - loss 0.05401489 - time (sec): 9.25 - samples/sec: 2881.42 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:42:17,136 epoch 5 - iter 98/146 - loss 0.05351693 - time (sec): 10.87 - samples/sec: 2916.64 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:42:18,316 epoch 5 - iter 112/146 - loss 0.05569047 - time (sec): 12.04 - samples/sec: 2930.09 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:42:19,708 epoch 5 - iter 126/146 - loss 0.05757850 - time (sec): 13.44 - samples/sec: 2926.24 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:42:20,948 epoch 5 - iter 140/146 - loss 0.05972547 - time (sec): 14.68 - samples/sec: 2944.98 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:42:21,431 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:21,431 EPOCH 5 done: loss 0.0601 - lr: 0.000017
2023-10-16 18:42:22,719 DEV : loss 0.10323068499565125 - f1-score (micro avg)  0.7639
2023-10-16 18:42:22,724 saving best model
2023-10-16 18:42:23,609 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:24,951 epoch 6 - iter 14/146 - loss 0.06420642 - time (sec): 1.34 - samples/sec: 3160.55 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:42:26,500 epoch 6 - iter 28/146 - loss 0.05228968 - time (sec): 2.89 - samples/sec: 3171.17 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:42:27,809 epoch 6 - iter 42/146 - loss 0.04666758 - time (sec): 4.20 - samples/sec: 3131.77 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:42:29,236 epoch 6 - iter 56/146 - loss 0.04726276 - time (sec): 5.62 - samples/sec: 3044.60 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:42:30,757 epoch 6 - iter 70/146 - loss 0.04494478 - time (sec): 7.14 - samples/sec: 3007.68 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:42:32,150 epoch 6 - iter 84/146 - loss 0.04396196 - time (sec): 8.54 - samples/sec: 2974.60 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:42:33,587 epoch 6 - iter 98/146 - loss 0.04190146 - time (sec): 9.97 - samples/sec: 3005.86 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:42:34,981 epoch 6 - iter 112/146 - loss 0.04097606 - time (sec): 11.37 - samples/sec: 3021.65 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:42:36,451 epoch 6 - iter 126/146 - loss 0.04174709 - time (sec): 12.84 - samples/sec: 3030.85 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:42:37,774 epoch 6 - iter 140/146 - loss 0.04259495 - time (sec): 14.16 - samples/sec: 3013.59 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:42:38,410 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:38,411 EPOCH 6 done: loss 0.0429 - lr: 0.000014
2023-10-16 18:42:39,666 DEV : loss 0.12086369842290878 - f1-score (micro avg)  0.7331
2023-10-16 18:42:39,671 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:41,060 epoch 7 - iter 14/146 - loss 0.03536041 - time (sec): 1.39 - samples/sec: 3083.88 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:42:42,338 epoch 7 - iter 28/146 - loss 0.03135766 - time (sec): 2.67 - samples/sec: 3113.18 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:42:43,772 epoch 7 - iter 42/146 - loss 0.02784114 - time (sec): 4.10 - samples/sec: 3139.52 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:42:45,298 epoch 7 - iter 56/146 - loss 0.02751308 - time (sec): 5.63 - samples/sec: 3096.78 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:42:46,664 epoch 7 - iter 70/146 - loss 0.02605524 - time (sec): 6.99 - samples/sec: 3045.10 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:42:48,465 epoch 7 - iter 84/146 - loss 0.03041736 - time (sec): 8.79 - samples/sec: 2978.93 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:42:49,866 epoch 7 - iter 98/146 - loss 0.02942324 - time (sec): 10.19 - samples/sec: 2977.37 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:42:51,429 epoch 7 - iter 112/146 - loss 0.03021149 - time (sec): 11.76 - samples/sec: 2934.42 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:42:52,967 epoch 7 - iter 126/146 - loss 0.03017723 - time (sec): 13.30 - samples/sec: 2911.90 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:42:54,512 epoch 7 - iter 140/146 - loss 0.03237412 - time (sec): 14.84 - samples/sec: 2895.37 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:42:55,004 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:55,005 EPOCH 7 done: loss 0.0319 - lr: 0.000010
2023-10-16 18:42:56,306 DEV : loss 0.12415074557065964 - f1-score (micro avg)  0.7716
2023-10-16 18:42:56,312 saving best model
2023-10-16 18:42:56,844 ----------------------------------------------------------------------------------------------------
2023-10-16 18:42:58,269 epoch 8 - iter 14/146 - loss 0.01649001 - time (sec): 1.42 - samples/sec: 3042.84 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:42:59,833 epoch 8 - iter 28/146 - loss 0.01761338 - time (sec): 2.99 - samples/sec: 2985.23 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:43:01,519 epoch 8 - iter 42/146 - loss 0.02810541 - time (sec): 4.67 - samples/sec: 2933.19 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:43:02,724 epoch 8 - iter 56/146 - loss 0.02894650 - time (sec): 5.88 - samples/sec: 2883.58 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:43:04,191 epoch 8 - iter 70/146 - loss 0.02727455 - time (sec): 7.34 - samples/sec: 2927.98 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:43:05,867 epoch 8 - iter 84/146 - loss 0.02891311 - time (sec): 9.02 - samples/sec: 2917.72 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:43:07,161 epoch 8 - iter 98/146 - loss 0.02597422 - time (sec): 10.31 - samples/sec: 2963.64 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:43:08,650 epoch 8 - iter 112/146 - loss 0.02727534 - time (sec): 11.80 - samples/sec: 2946.61 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:43:10,154 epoch 8 - iter 126/146 - loss 0.02628321 - time (sec): 13.31 - samples/sec: 2947.47 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:43:11,406 epoch 8 - iter 140/146 - loss 0.02579516 - time (sec): 14.56 - samples/sec: 2952.88 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:43:11,909 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:11,909 EPOCH 8 done: loss 0.0259 - lr: 0.000007
2023-10-16 18:43:13,196 DEV : loss 0.1355086863040924 - f1-score (micro avg)  0.7191
2023-10-16 18:43:13,202 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:14,688 epoch 9 - iter 14/146 - loss 0.05423101 - time (sec): 1.48 - samples/sec: 3174.47 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:43:16,439 epoch 9 - iter 28/146 - loss 0.04240930 - time (sec): 3.24 - samples/sec: 2894.49 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:43:17,996 epoch 9 - iter 42/146 - loss 0.03263167 - time (sec): 4.79 - samples/sec: 2719.93 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:43:19,384 epoch 9 - iter 56/146 - loss 0.02854858 - time (sec): 6.18 - samples/sec: 2772.88 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:43:20,598 epoch 9 - iter 70/146 - loss 0.02688335 - time (sec): 7.39 - samples/sec: 2826.17 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:43:21,940 epoch 9 - iter 84/146 - loss 0.02662945 - time (sec): 8.74 - samples/sec: 2861.28 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:43:23,486 epoch 9 - iter 98/146 - loss 0.02365962 - time (sec): 10.28 - samples/sec: 2875.77 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:43:24,796 epoch 9 - iter 112/146 - loss 0.02270041 - time (sec): 11.59 - samples/sec: 2868.17 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:43:26,452 epoch 9 - iter 126/146 - loss 0.02197911 - time (sec): 13.25 - samples/sec: 2846.24 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:43:28,011 epoch 9 - iter 140/146 - loss 0.02169160 - time (sec): 14.81 - samples/sec: 2863.71 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:43:28,594 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:28,595 EPOCH 9 done: loss 0.0215 - lr: 0.000004
2023-10-16 18:43:29,856 DEV : loss 0.13685813546180725 - f1-score (micro avg)  0.7484
2023-10-16 18:43:29,862 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:31,152 epoch 10 - iter 14/146 - loss 0.02494954 - time (sec): 1.29 - samples/sec: 2857.52 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:43:32,476 epoch 10 - iter 28/146 - loss 0.01602085 - time (sec): 2.61 - samples/sec: 3020.65 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:43:33,794 epoch 10 - iter 42/146 - loss 0.01621527 - time (sec): 3.93 - samples/sec: 3066.78 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:43:35,570 epoch 10 - iter 56/146 - loss 0.01678019 - time (sec): 5.71 - samples/sec: 2969.94 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:43:37,094 epoch 10 - iter 70/146 - loss 0.01639143 - time (sec): 7.23 - samples/sec: 3007.77 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:43:38,504 epoch 10 - iter 84/146 - loss 0.01946257 - time (sec): 8.64 - samples/sec: 3023.29 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:43:39,791 epoch 10 - iter 98/146 - loss 0.01874862 - time (sec): 9.93 - samples/sec: 3038.93 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:43:41,180 epoch 10 - iter 112/146 - loss 0.01865213 - time (sec): 11.32 - samples/sec: 3027.93 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:43:42,660 epoch 10 - iter 126/146 - loss 0.01993260 - time (sec): 12.80 - samples/sec: 3034.99 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:43:44,108 epoch 10 - iter 140/146 - loss 0.01869978 - time (sec): 14.25 - samples/sec: 3042.21 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:43:44,558 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:44,558 EPOCH 10 done: loss 0.0183 - lr: 0.000000
2023-10-16 18:43:45,833 DEV : loss 0.14017795026302338 - f1-score (micro avg)  0.742
2023-10-16 18:43:46,218 ----------------------------------------------------------------------------------------------------
2023-10-16 18:43:46,219 Loading model from best epoch ...
2023-10-16 18:43:47,826 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-16 18:43:50,204 
Results:
- F-score (micro) 0.7545
- F-score (macro) 0.6678
- Accuracy 0.6271

By class:
              precision    recall  f1-score   support

         PER     0.8027    0.8420    0.8219       348
         LOC     0.6707    0.8429    0.7470       261
         ORG     0.3500    0.4038    0.3750        52
   HumanProd     0.7273    0.7273    0.7273        22

   micro avg     0.7097    0.8053    0.7545       683
   macro avg     0.6377    0.7040    0.6678       683
weighted avg     0.7154    0.8053    0.7562       683

2023-10-16 18:43:50,205 ----------------------------------------------------------------------------------------------------
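As a quick sanity check, the reported micro-average F-score is consistent with the micro precision and recall in the table above. A minimal sketch (the `f1` helper below is just the standard harmonic mean, not a function from the training code):

```python
def f1(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Micro-averaged precision/recall from the final evaluation table
micro_p, micro_r = 0.7097, 0.8053
print(round(f1(micro_p, micro_r), 4))  # → 0.7545, matching "F-score (micro)"
```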