2023-10-17 18:51:00,787 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:00,788 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:00,788 MultiCorpus: 5777 train + 722 dev + 723 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:00,788 Train:  5777 sentences
2023-10-17 18:51:00,788         (train_with_dev=False, train_with_test=False)
2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:00,788 Training Params:
2023-10-17 18:51:00,788  - learning_rate: "3e-05" 
2023-10-17 18:51:00,788  - mini_batch_size: "8"
2023-10-17 18:51:00,788  - max_epochs: "10"
2023-10-17 18:51:00,788  - shuffle: "True"
2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:00,788 Plugins:
2023-10-17 18:51:00,788  - TensorboardLogger
2023-10-17 18:51:00,788  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
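The `lr` column in the per-iteration lines of this log is consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The sketch below reproduces that shape under stated assumptions: the peak rate is the logged `learning_rate` of 3e-05, the total step count is 10 epochs times 723 iterations, and the function names (`linear_schedule_lr`, `lr_at`) are hypothetical helpers, not Flair's implementation.

```python
# Minimal sketch of a linear warmup + linear decay schedule.
# Assumptions (not taken from Flair's source): the schedule is stepped once
# per mini-batch, and the logged value corresponds to the step just completed.

ITERS_PER_EPOCH = 723      # from "iter 72/723" in the log
MAX_EPOCHS = 10            # from "max_epochs"
TOTAL_STEPS = ITERS_PER_EPOCH * MAX_EPOCHS


def linear_schedule_lr(step, total_steps, peak_lr=3e-5, warmup_fraction=0.1):
    """Return the learning rate after `step` optimizer steps."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Warmup: rise linearly from 0 to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from peak_lr to 0 at total_steps.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


def lr_at(epoch, iteration):
    """Learning rate at a given (1-based) epoch and iteration of this run."""
    step = (epoch - 1) * ITERS_PER_EPOCH + iteration
    return linear_schedule_lr(step, TOTAL_STEPS)


if __name__ == "__main__":
    for epoch, iteration in [(1, 72), (1, 720), (2, 144), (5, 720), (10, 720)]:
        print(f"epoch {epoch} iter {iteration}: lr {lr_at(epoch, iteration):.6f}")
```

Formatted to six decimals, the values from this sketch can be compared against the `lr:` field of the corresponding log lines.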
2023-10-17 18:51:00,788 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 18:51:00,788  - metric: "('micro avg', 'f1-score')"
2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:00,788 Computation:
2023-10-17 18:51:00,788  - compute on device: cuda:0
2023-10-17 18:51:00,788  - embedding storage: none
2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:00,788 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 18:51:00,788 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:00,789 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:00,789 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 18:51:05,999 epoch 1 - iter 72/723 - loss 2.73572129 - time (sec): 5.21 - samples/sec: 3228.40 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:51:11,090 epoch 1 - iter 144/723 - loss 1.71636677 - time (sec): 10.30 - samples/sec: 3297.56 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:51:16,224 epoch 1 - iter 216/723 - loss 1.21015689 - time (sec): 15.43 - samples/sec: 3319.86 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:51:21,390 epoch 1 - iter 288/723 - loss 0.94420268 - time (sec): 20.60 - samples/sec: 3347.59 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:51:26,193 epoch 1 - iter 360/723 - loss 0.78590918 - time (sec): 25.40 - samples/sec: 3395.27 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:51:31,477 epoch 1 - iter 432/723 - loss 0.67292077 - time (sec): 30.69 - samples/sec: 3400.44 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:51:36,674 epoch 1 - iter 504/723 - loss 0.59756926 - time (sec): 35.88 - samples/sec: 3400.07 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:51:42,043 epoch 1 - iter 576/723 - loss 0.53717802 - time (sec): 41.25 - samples/sec: 3387.98 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:51:47,518 epoch 1 - iter 648/723 - loss 0.48924915 - time (sec): 46.73 - samples/sec: 3371.37 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:51:52,896 epoch 1 - iter 720/723 - loss 0.45323298 - time (sec): 52.11 - samples/sec: 3367.96 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:51:53,104 ----------------------------------------------------------------------------------------------------
2023-10-17 18:51:53,105 EPOCH 1 done: loss 0.4518 - lr: 0.000030
2023-10-17 18:51:55,813 DEV : loss 0.08571955561637878 - f1-score (micro avg)  0.7621
2023-10-17 18:51:55,830 saving best model
2023-10-17 18:51:56,351 ----------------------------------------------------------------------------------------------------
2023-10-17 18:52:01,211 epoch 2 - iter 72/723 - loss 0.09693799 - time (sec): 4.86 - samples/sec: 3414.32 - lr: 0.000030 - momentum: 0.000000
2023-10-17 18:52:06,705 epoch 2 - iter 144/723 - loss 0.09826077 - time (sec): 10.35 - samples/sec: 3307.89 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:52:11,708 epoch 2 - iter 216/723 - loss 0.09503249 - time (sec): 15.36 - samples/sec: 3367.88 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:52:16,853 epoch 2 - iter 288/723 - loss 0.09891978 - time (sec): 20.50 - samples/sec: 3362.88 - lr: 0.000029 - momentum: 0.000000
2023-10-17 18:52:22,304 epoch 2 - iter 360/723 - loss 0.09274178 - time (sec): 25.95 - samples/sec: 3380.81 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:52:27,873 epoch 2 - iter 432/723 - loss 0.08927963 - time (sec): 31.52 - samples/sec: 3401.91 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:52:32,724 epoch 2 - iter 504/723 - loss 0.09084334 - time (sec): 36.37 - samples/sec: 3381.60 - lr: 0.000028 - momentum: 0.000000
2023-10-17 18:52:38,353 epoch 2 - iter 576/723 - loss 0.09010738 - time (sec): 42.00 - samples/sec: 3375.35 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:52:43,617 epoch 2 - iter 648/723 - loss 0.08974176 - time (sec): 47.26 - samples/sec: 3348.81 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:52:48,857 epoch 2 - iter 720/723 - loss 0.08723098 - time (sec): 52.50 - samples/sec: 3343.67 - lr: 0.000027 - momentum: 0.000000
2023-10-17 18:52:49,018 ----------------------------------------------------------------------------------------------------
2023-10-17 18:52:49,019 EPOCH 2 done: loss 0.0871 - lr: 0.000027
2023-10-17 18:52:52,238 DEV : loss 0.05628642439842224 - f1-score (micro avg)  0.8664
2023-10-17 18:52:52,255 saving best model
2023-10-17 18:52:52,649 ----------------------------------------------------------------------------------------------------
2023-10-17 18:52:58,275 epoch 3 - iter 72/723 - loss 0.06108376 - time (sec): 5.62 - samples/sec: 3077.41 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:53:03,076 epoch 3 - iter 144/723 - loss 0.06229454 - time (sec): 10.43 - samples/sec: 3255.95 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:53:08,621 epoch 3 - iter 216/723 - loss 0.06269040 - time (sec): 15.97 - samples/sec: 3261.71 - lr: 0.000026 - momentum: 0.000000
2023-10-17 18:53:14,059 epoch 3 - iter 288/723 - loss 0.05782452 - time (sec): 21.41 - samples/sec: 3273.63 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:53:19,376 epoch 3 - iter 360/723 - loss 0.05776341 - time (sec): 26.73 - samples/sec: 3305.07 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:53:24,787 epoch 3 - iter 432/723 - loss 0.06112370 - time (sec): 32.14 - samples/sec: 3288.95 - lr: 0.000025 - momentum: 0.000000
2023-10-17 18:53:29,871 epoch 3 - iter 504/723 - loss 0.06276585 - time (sec): 37.22 - samples/sec: 3302.02 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:53:35,301 epoch 3 - iter 576/723 - loss 0.06100180 - time (sec): 42.65 - samples/sec: 3318.70 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:53:40,357 epoch 3 - iter 648/723 - loss 0.06087244 - time (sec): 47.71 - samples/sec: 3323.88 - lr: 0.000024 - momentum: 0.000000
2023-10-17 18:53:45,595 epoch 3 - iter 720/723 - loss 0.06067803 - time (sec): 52.94 - samples/sec: 3316.47 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:53:45,776 ----------------------------------------------------------------------------------------------------
2023-10-17 18:53:45,777 EPOCH 3 done: loss 0.0606 - lr: 0.000023
2023-10-17 18:53:48,994 DEV : loss 0.061015695333480835 - f1-score (micro avg)  0.8829
2023-10-17 18:53:49,010 saving best model
2023-10-17 18:53:49,436 ----------------------------------------------------------------------------------------------------
2023-10-17 18:53:54,717 epoch 4 - iter 72/723 - loss 0.04387038 - time (sec): 5.28 - samples/sec: 3489.64 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:54:00,075 epoch 4 - iter 144/723 - loss 0.03977768 - time (sec): 10.64 - samples/sec: 3434.68 - lr: 0.000023 - momentum: 0.000000
2023-10-17 18:54:05,046 epoch 4 - iter 216/723 - loss 0.04174055 - time (sec): 15.61 - samples/sec: 3394.76 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:54:10,454 epoch 4 - iter 288/723 - loss 0.04295155 - time (sec): 21.02 - samples/sec: 3371.28 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:54:15,352 epoch 4 - iter 360/723 - loss 0.04184529 - time (sec): 25.92 - samples/sec: 3370.25 - lr: 0.000022 - momentum: 0.000000
2023-10-17 18:54:20,627 epoch 4 - iter 432/723 - loss 0.04216528 - time (sec): 31.19 - samples/sec: 3358.32 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:54:25,677 epoch 4 - iter 504/723 - loss 0.04174425 - time (sec): 36.24 - samples/sec: 3385.08 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:54:31,093 epoch 4 - iter 576/723 - loss 0.04256262 - time (sec): 41.66 - samples/sec: 3366.49 - lr: 0.000021 - momentum: 0.000000
2023-10-17 18:54:36,301 epoch 4 - iter 648/723 - loss 0.04255295 - time (sec): 46.86 - samples/sec: 3360.15 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:54:41,551 epoch 4 - iter 720/723 - loss 0.04326055 - time (sec): 52.11 - samples/sec: 3372.14 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:54:41,713 ----------------------------------------------------------------------------------------------------
2023-10-17 18:54:41,713 EPOCH 4 done: loss 0.0432 - lr: 0.000020
2023-10-17 18:54:45,293 DEV : loss 0.07059507817029953 - f1-score (micro avg)  0.8652
2023-10-17 18:54:45,309 ----------------------------------------------------------------------------------------------------
2023-10-17 18:54:50,582 epoch 5 - iter 72/723 - loss 0.03765946 - time (sec): 5.27 - samples/sec: 3200.52 - lr: 0.000020 - momentum: 0.000000
2023-10-17 18:54:55,435 epoch 5 - iter 144/723 - loss 0.03376893 - time (sec): 10.12 - samples/sec: 3269.32 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:55:01,381 epoch 5 - iter 216/723 - loss 0.03444213 - time (sec): 16.07 - samples/sec: 3259.98 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:55:06,439 epoch 5 - iter 288/723 - loss 0.03201450 - time (sec): 21.13 - samples/sec: 3278.94 - lr: 0.000019 - momentum: 0.000000
2023-10-17 18:55:11,837 epoch 5 - iter 360/723 - loss 0.03021248 - time (sec): 26.53 - samples/sec: 3268.82 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:55:17,061 epoch 5 - iter 432/723 - loss 0.03095066 - time (sec): 31.75 - samples/sec: 3295.73 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:55:22,311 epoch 5 - iter 504/723 - loss 0.03205508 - time (sec): 37.00 - samples/sec: 3318.65 - lr: 0.000018 - momentum: 0.000000
2023-10-17 18:55:27,501 epoch 5 - iter 576/723 - loss 0.03251336 - time (sec): 42.19 - samples/sec: 3325.42 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:55:32,561 epoch 5 - iter 648/723 - loss 0.03276588 - time (sec): 47.25 - samples/sec: 3329.33 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:55:37,985 epoch 5 - iter 720/723 - loss 0.03238438 - time (sec): 52.67 - samples/sec: 3338.39 - lr: 0.000017 - momentum: 0.000000
2023-10-17 18:55:38,138 ----------------------------------------------------------------------------------------------------
2023-10-17 18:55:38,139 EPOCH 5 done: loss 0.0324 - lr: 0.000017
2023-10-17 18:55:41,451 DEV : loss 0.07911184430122375 - f1-score (micro avg)  0.8697
2023-10-17 18:55:41,469 ----------------------------------------------------------------------------------------------------
2023-10-17 18:55:46,855 epoch 6 - iter 72/723 - loss 0.01939646 - time (sec): 5.38 - samples/sec: 3381.89 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:55:52,097 epoch 6 - iter 144/723 - loss 0.02202295 - time (sec): 10.63 - samples/sec: 3376.79 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:55:57,312 epoch 6 - iter 216/723 - loss 0.02329081 - time (sec): 15.84 - samples/sec: 3385.10 - lr: 0.000016 - momentum: 0.000000
2023-10-17 18:56:03,153 epoch 6 - iter 288/723 - loss 0.02709437 - time (sec): 21.68 - samples/sec: 3284.61 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:56:08,569 epoch 6 - iter 360/723 - loss 0.02852147 - time (sec): 27.10 - samples/sec: 3309.49 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:56:13,824 epoch 6 - iter 432/723 - loss 0.02704669 - time (sec): 32.35 - samples/sec: 3322.19 - lr: 0.000015 - momentum: 0.000000
2023-10-17 18:56:18,846 epoch 6 - iter 504/723 - loss 0.02698341 - time (sec): 37.38 - samples/sec: 3337.29 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:56:23,707 epoch 6 - iter 576/723 - loss 0.02713054 - time (sec): 42.24 - samples/sec: 3344.91 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:56:28,856 epoch 6 - iter 648/723 - loss 0.02638017 - time (sec): 47.38 - samples/sec: 3348.51 - lr: 0.000014 - momentum: 0.000000
2023-10-17 18:56:33,896 epoch 6 - iter 720/723 - loss 0.02633353 - time (sec): 52.43 - samples/sec: 3352.29 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:56:34,069 ----------------------------------------------------------------------------------------------------
2023-10-17 18:56:34,069 EPOCH 6 done: loss 0.0263 - lr: 0.000013
2023-10-17 18:56:37,242 DEV : loss 0.08742444217205048 - f1-score (micro avg)  0.8809
2023-10-17 18:56:37,259 ----------------------------------------------------------------------------------------------------
2023-10-17 18:56:42,537 epoch 7 - iter 72/723 - loss 0.01010414 - time (sec): 5.28 - samples/sec: 3350.03 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:56:47,589 epoch 7 - iter 144/723 - loss 0.01933338 - time (sec): 10.33 - samples/sec: 3321.61 - lr: 0.000013 - momentum: 0.000000
2023-10-17 18:56:53,263 epoch 7 - iter 216/723 - loss 0.01860946 - time (sec): 16.00 - samples/sec: 3316.41 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:56:58,767 epoch 7 - iter 288/723 - loss 0.01987477 - time (sec): 21.51 - samples/sec: 3329.15 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:57:04,192 epoch 7 - iter 360/723 - loss 0.01978143 - time (sec): 26.93 - samples/sec: 3325.96 - lr: 0.000012 - momentum: 0.000000
2023-10-17 18:57:09,642 epoch 7 - iter 432/723 - loss 0.02023814 - time (sec): 32.38 - samples/sec: 3310.10 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:57:14,799 epoch 7 - iter 504/723 - loss 0.01949429 - time (sec): 37.54 - samples/sec: 3315.96 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:57:19,826 epoch 7 - iter 576/723 - loss 0.01865648 - time (sec): 42.57 - samples/sec: 3327.50 - lr: 0.000011 - momentum: 0.000000
2023-10-17 18:57:24,812 epoch 7 - iter 648/723 - loss 0.01853140 - time (sec): 47.55 - samples/sec: 3333.24 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:57:30,046 epoch 7 - iter 720/723 - loss 0.01839669 - time (sec): 52.79 - samples/sec: 3328.86 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:57:30,201 ----------------------------------------------------------------------------------------------------
2023-10-17 18:57:30,201 EPOCH 7 done: loss 0.0184 - lr: 0.000010
2023-10-17 18:57:33,757 DEV : loss 0.10578546673059464 - f1-score (micro avg)  0.8809
2023-10-17 18:57:33,774 ----------------------------------------------------------------------------------------------------
2023-10-17 18:57:38,932 epoch 8 - iter 72/723 - loss 0.00841995 - time (sec): 5.16 - samples/sec: 3443.43 - lr: 0.000010 - momentum: 0.000000
2023-10-17 18:57:44,074 epoch 8 - iter 144/723 - loss 0.01229420 - time (sec): 10.30 - samples/sec: 3426.89 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:57:49,096 epoch 8 - iter 216/723 - loss 0.01355410 - time (sec): 15.32 - samples/sec: 3395.51 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:57:54,305 epoch 8 - iter 288/723 - loss 0.01394882 - time (sec): 20.53 - samples/sec: 3385.90 - lr: 0.000009 - momentum: 0.000000
2023-10-17 18:57:59,323 epoch 8 - iter 360/723 - loss 0.01323447 - time (sec): 25.55 - samples/sec: 3374.12 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:58:04,498 epoch 8 - iter 432/723 - loss 0.01270974 - time (sec): 30.72 - samples/sec: 3377.16 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:58:09,812 epoch 8 - iter 504/723 - loss 0.01255571 - time (sec): 36.04 - samples/sec: 3355.95 - lr: 0.000008 - momentum: 0.000000
2023-10-17 18:58:15,497 epoch 8 - iter 576/723 - loss 0.01341256 - time (sec): 41.72 - samples/sec: 3360.99 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:58:20,693 epoch 8 - iter 648/723 - loss 0.01371160 - time (sec): 46.92 - samples/sec: 3357.01 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:58:26,278 epoch 8 - iter 720/723 - loss 0.01394702 - time (sec): 52.50 - samples/sec: 3344.06 - lr: 0.000007 - momentum: 0.000000
2023-10-17 18:58:26,474 ----------------------------------------------------------------------------------------------------
2023-10-17 18:58:26,474 EPOCH 8 done: loss 0.0139 - lr: 0.000007
2023-10-17 18:58:29,679 DEV : loss 0.11371435225009918 - f1-score (micro avg)  0.8805
2023-10-17 18:58:29,695 ----------------------------------------------------------------------------------------------------
2023-10-17 18:58:35,055 epoch 9 - iter 72/723 - loss 0.01003235 - time (sec): 5.36 - samples/sec: 3286.54 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:58:40,336 epoch 9 - iter 144/723 - loss 0.00981857 - time (sec): 10.64 - samples/sec: 3403.11 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:58:45,118 epoch 9 - iter 216/723 - loss 0.01076096 - time (sec): 15.42 - samples/sec: 3432.99 - lr: 0.000006 - momentum: 0.000000
2023-10-17 18:58:50,044 epoch 9 - iter 288/723 - loss 0.01027158 - time (sec): 20.35 - samples/sec: 3464.18 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:58:55,521 epoch 9 - iter 360/723 - loss 0.00989332 - time (sec): 25.82 - samples/sec: 3429.30 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:59:00,544 epoch 9 - iter 432/723 - loss 0.00992623 - time (sec): 30.85 - samples/sec: 3432.33 - lr: 0.000005 - momentum: 0.000000
2023-10-17 18:59:06,392 epoch 9 - iter 504/723 - loss 0.01080183 - time (sec): 36.70 - samples/sec: 3398.02 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:59:11,482 epoch 9 - iter 576/723 - loss 0.01069467 - time (sec): 41.79 - samples/sec: 3391.20 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:59:16,665 epoch 9 - iter 648/723 - loss 0.01087289 - time (sec): 46.97 - samples/sec: 3403.56 - lr: 0.000004 - momentum: 0.000000
2023-10-17 18:59:21,382 epoch 9 - iter 720/723 - loss 0.01170845 - time (sec): 51.69 - samples/sec: 3401.34 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:59:21,537 ----------------------------------------------------------------------------------------------------
2023-10-17 18:59:21,537 EPOCH 9 done: loss 0.0117 - lr: 0.000003
2023-10-17 18:59:24,756 DEV : loss 0.11608566343784332 - f1-score (micro avg)  0.8813
2023-10-17 18:59:24,773 ----------------------------------------------------------------------------------------------------
2023-10-17 18:59:30,213 epoch 10 - iter 72/723 - loss 0.01498393 - time (sec): 5.44 - samples/sec: 3305.97 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:59:35,092 epoch 10 - iter 144/723 - loss 0.00950962 - time (sec): 10.32 - samples/sec: 3395.52 - lr: 0.000003 - momentum: 0.000000
2023-10-17 18:59:40,454 epoch 10 - iter 216/723 - loss 0.00882272 - time (sec): 15.68 - samples/sec: 3373.59 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:59:45,853 epoch 10 - iter 288/723 - loss 0.00879424 - time (sec): 21.08 - samples/sec: 3350.52 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:59:50,927 epoch 10 - iter 360/723 - loss 0.00903110 - time (sec): 26.15 - samples/sec: 3359.20 - lr: 0.000002 - momentum: 0.000000
2023-10-17 18:59:56,386 epoch 10 - iter 432/723 - loss 0.00841109 - time (sec): 31.61 - samples/sec: 3355.32 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:00:01,775 epoch 10 - iter 504/723 - loss 0.00806376 - time (sec): 37.00 - samples/sec: 3325.43 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:00:06,842 epoch 10 - iter 576/723 - loss 0.00790310 - time (sec): 42.07 - samples/sec: 3324.90 - lr: 0.000001 - momentum: 0.000000
2023-10-17 19:00:12,025 epoch 10 - iter 648/723 - loss 0.00804525 - time (sec): 47.25 - samples/sec: 3339.23 - lr: 0.000000 - momentum: 0.000000
2023-10-17 19:00:17,383 epoch 10 - iter 720/723 - loss 0.00796014 - time (sec): 52.61 - samples/sec: 3342.17 - lr: 0.000000 - momentum: 0.000000
2023-10-17 19:00:17,532 ----------------------------------------------------------------------------------------------------
2023-10-17 19:00:17,533 EPOCH 10 done: loss 0.0079 - lr: 0.000000
2023-10-17 19:00:21,687 DEV : loss 0.12145841866731644 - f1-score (micro avg)  0.8792
2023-10-17 19:00:22,119 ----------------------------------------------------------------------------------------------------
2023-10-17 19:00:22,121 Loading model from best epoch ...
2023-10-17 19:00:23,863 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 19:00:27,596 
Results:
- F-score (micro) 0.8643
- F-score (macro) 0.7231
- Accuracy 0.7673

By class:
              precision    recall  f1-score   support

         PER     0.8669    0.8651    0.8660       482
         LOC     0.9509    0.8886    0.9187       458
         ORG     0.5714    0.2899    0.3846        69

   micro avg     0.8941    0.8365    0.8643      1009
   macro avg     0.7964    0.6812    0.7231      1009
weighted avg     0.8849    0.8365    0.8570      1009

2023-10-17 19:00:27,596 ----------------------------------------------------------------------------------------------------
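The averaged rows in the report above can be recomputed from the per-class rows. The following sketch does so using only the rounded figures printed in the log, so agreement is expected to about four decimals. The variable and function names are illustrative, and the micro average is derived here from the reported micro precision and recall rather than from raw true/false positive counts, which the log does not show.

```python
# Recompute the summary F-scores from the values printed in the report.
# Each entry: class -> (precision, recall, f1, support), copied from the log.
per_class = {
    "PER": (0.8669, 0.8651, 0.8660, 482),
    "LOC": (0.9509, 0.8886, 0.9187, 458),
    "ORG": (0.5714, 0.2899, 0.3846, 69),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Micro F1: harmonic mean of the pooled (micro) precision and recall.
micro_f1 = f1(0.8941, 0.8365)

# Macro F1: unweighted mean of per-class F1, so ORG counts as much as PER.
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / len(per_class)

# Weighted F1: per-class F1 weighted by support (number of gold entities).
total_support = sum(n for _, _, _, n in per_class.values())
weighted_f1 = sum(f * n for _, _, f, n in per_class.values()) / total_support

if __name__ == "__main__":
    print(f"micro F1    {micro_f1:.4f}")
    print(f"macro F1    {macro_f1:.4f}")
    print(f"weighted F1 {weighted_f1:.4f}")
    print(f"support     {total_support}")
```

The gap between the micro and macro scores comes from the ORG class: it has only 69 of the 1009 gold entities, so its low F1 barely moves the micro average but pulls the macro average down by a third of its shortfall.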