2023-10-13 21:37:23,966 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:23,967 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 21:37:23,967 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:23,967 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 21:37:23,967 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:23,967 Train:  7936 sentences
2023-10-13 21:37:23,967         (train_with_dev=False, train_with_test=False)
2023-10-13 21:37:23,967 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:23,967 Training Params:
2023-10-13 21:37:23,967  - learning_rate: "5e-05"
2023-10-13 21:37:23,968  - mini_batch_size: "8"
2023-10-13 21:37:23,968  - max_epochs: "10"
2023-10-13 21:37:23,968  - shuffle: "True"
2023-10-13 21:37:23,968 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:23,968 Plugins:
2023-10-13 21:37:23,968  - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 21:37:23,968 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:23,968 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 21:37:23,968  - metric: "('micro avg', 'f1-score')"
2023-10-13 21:37:23,968 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:23,968 Computation:
2023-10-13 21:37:23,968  - compute on device: cuda:0
2023-10-13 21:37:23,968  - embedding storage: none
2023-10-13 21:37:23,968 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:23,968 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-13 21:37:23,968 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:23,968 ----------------------------------------------------------------------------------------------------
2023-10-13 21:37:29,890 epoch 1 - iter 99/992 - loss 1.91261312 - time (sec): 5.92 - samples/sec: 2715.74 - lr: 0.000005 - momentum: 0.000000
2023-10-13 21:37:35,908 epoch 1 - iter 198/992 - loss 1.13445665 - time (sec): 11.94 - samples/sec: 2727.88 - lr: 0.000010 - momentum: 0.000000
2023-10-13 21:37:41,742 epoch 1 - iter 297/992 - loss 0.83788670 - time (sec): 17.77 - samples/sec: 2767.01 - lr: 0.000015 - momentum: 0.000000
2023-10-13 21:37:47,441 epoch 1 - iter 396/992 - loss 0.67694574 - time (sec): 23.47 - samples/sec: 2781.96 - lr: 0.000020 - momentum: 0.000000
2023-10-13 21:37:53,157 epoch 1 - iter 495/992 - loss 0.57510064 - time (sec): 29.19 - samples/sec: 2789.19 - lr: 0.000025 - momentum: 0.000000
2023-10-13 21:37:58,914 epoch 1 - iter 594/992 - loss 0.50275111 - time (sec): 34.95 - samples/sec: 2791.76 - lr: 0.000030 - momentum: 0.000000
2023-10-13 21:38:04,772 epoch 1 - iter 693/992 - loss 0.44880423 - time (sec): 40.80 - samples/sec: 2810.61 - lr: 0.000035 - momentum: 0.000000
2023-10-13 21:38:11,079 epoch 1 - iter 792/992 - loss 0.40684356 - time (sec): 47.11 - samples/sec: 2805.49 - lr: 0.000040 - momentum: 0.000000
2023-10-13 21:38:16,874 epoch 1 - iter 891/992 - loss 0.37820026 - time (sec): 52.90 - samples/sec: 2799.25 - lr: 0.000045 - momentum: 0.000000
2023-10-13 21:38:22,677 epoch 1 - iter 990/992 - loss 0.35457753 - time (sec): 58.71 - samples/sec: 2789.75 - lr: 0.000050 - momentum: 0.000000
2023-10-13 21:38:22,786 ----------------------------------------------------------------------------------------------------
2023-10-13 21:38:22,786 EPOCH 1 done: loss 0.3542 - lr: 0.000050
2023-10-13 21:38:26,277 DEV : loss 0.09475447982549667 - f1-score (micro avg)  0.6958
2023-10-13 21:38:26,298 saving best model
2023-10-13 21:38:26,721 ----------------------------------------------------------------------------------------------------
2023-10-13 21:38:32,814 epoch 2 - iter 99/992 - loss 0.12012769 - time (sec): 6.09 - samples/sec: 2833.35 - lr: 0.000049 - momentum: 0.000000
2023-10-13 21:38:38,526 epoch 2 - iter 198/992 - loss 0.11631000 - time (sec): 11.80 - samples/sec: 2746.93 - lr: 0.000049 - momentum: 0.000000
2023-10-13 21:38:44,938 epoch 2 - iter 297/992 - loss 0.11145763 - time (sec): 18.22 - samples/sec: 2735.81 - lr: 0.000048 - momentum: 0.000000
2023-10-13 21:38:50,662 epoch 2 - iter 396/992 - loss 0.10881005 - time (sec): 23.94 - samples/sec: 2704.74 - lr: 0.000048 - momentum: 0.000000
2023-10-13 21:38:56,694 epoch 2 - iter 495/992 - loss 0.10846778 - time (sec): 29.97 - samples/sec: 2740.68 - lr: 0.000047 - momentum: 0.000000
2023-10-13 21:39:02,441 epoch 2 - iter 594/992 - loss 0.10567866 - time (sec): 35.72 - samples/sec: 2746.65 - lr: 0.000047 - momentum: 0.000000
2023-10-13 21:39:08,610 epoch 2 - iter 693/992 - loss 0.10559153 - time (sec): 41.89 - samples/sec: 2735.81 - lr: 0.000046 - momentum: 0.000000
2023-10-13 21:39:14,672 epoch 2 - iter 792/992 - loss 0.10488835 - time (sec): 47.95 - samples/sec: 2722.53 - lr: 0.000046 - momentum: 0.000000
2023-10-13 21:39:20,740 epoch 2 - iter 891/992 - loss 0.10442316 - time (sec): 54.02 - samples/sec: 2727.03 - lr: 0.000045 - momentum: 0.000000
2023-10-13 21:39:26,803 epoch 2 - iter 990/992 - loss 0.10329774 - time (sec): 60.08 - samples/sec: 2725.62 - lr: 0.000044 - momentum: 0.000000
2023-10-13 21:39:26,916 ----------------------------------------------------------------------------------------------------
2023-10-13 21:39:26,916 EPOCH 2 done: loss 0.1033 - lr: 0.000044
2023-10-13 21:39:30,341 DEV : loss 0.08205121755599976 - f1-score (micro avg)  0.7129
2023-10-13 21:39:30,361 saving best model
2023-10-13 21:39:30,861 ----------------------------------------------------------------------------------------------------
2023-10-13 21:39:36,742 epoch 3 - iter 99/992 - loss 0.07082610 - time (sec): 5.87 - samples/sec: 2789.48 - lr: 0.000044 - momentum: 0.000000
2023-10-13 21:39:42,620 epoch 3 - iter 198/992 - loss 0.06839682 - time (sec): 11.75 - samples/sec: 2713.38 - lr: 0.000043 - momentum: 0.000000
2023-10-13 21:39:48,758 epoch 3 - iter 297/992 - loss 0.06531227 - time (sec): 17.89 - samples/sec: 2763.77 - lr: 0.000043 - momentum: 0.000000
2023-10-13 21:39:54,864 epoch 3 - iter 396/992 - loss 0.06974993 - time (sec): 23.99 - samples/sec: 2777.66 - lr: 0.000042 - momentum: 0.000000
2023-10-13 21:40:00,891 epoch 3 - iter 495/992 - loss 0.07037521 - time (sec): 30.02 - samples/sec: 2759.70 - lr: 0.000042 - momentum: 0.000000
2023-10-13 21:40:06,636 epoch 3 - iter 594/992 - loss 0.07072352 - time (sec): 35.77 - samples/sec: 2768.83 - lr: 0.000041 - momentum: 0.000000
2023-10-13 21:40:12,505 epoch 3 - iter 693/992 - loss 0.07026764 - time (sec): 41.64 - samples/sec: 2761.90 - lr: 0.000041 - momentum: 0.000000
2023-10-13 21:40:18,247 epoch 3 - iter 792/992 - loss 0.07260697 - time (sec): 47.38 - samples/sec: 2775.98 - lr: 0.000040 - momentum: 0.000000
2023-10-13 21:40:24,041 epoch 3 - iter 891/992 - loss 0.07256251 - time (sec): 53.17 - samples/sec: 2778.82 - lr: 0.000039 - momentum: 0.000000
2023-10-13 21:40:29,699 epoch 3 - iter 990/992 - loss 0.07225091 - time (sec): 58.83 - samples/sec: 2780.17 - lr: 0.000039 - momentum: 0.000000
2023-10-13 21:40:29,815 ----------------------------------------------------------------------------------------------------
2023-10-13 21:40:29,815 EPOCH 3 done: loss 0.0722 - lr: 0.000039
2023-10-13 21:40:33,902 DEV : loss 0.10864556580781937 - f1-score (micro avg)  0.758
2023-10-13 21:40:33,939 saving best model
2023-10-13 21:40:34,466 ----------------------------------------------------------------------------------------------------
2023-10-13 21:40:40,380 epoch 4 - iter 99/992 - loss 0.05150709 - time (sec): 5.91 - samples/sec: 2796.73 - lr: 0.000038 - momentum: 0.000000
2023-10-13 21:40:46,332 epoch 4 - iter 198/992 - loss 0.04958120 - time (sec): 11.86 - samples/sec: 2802.33 - lr: 0.000038 - momentum: 0.000000
2023-10-13 21:40:51,999 epoch 4 - iter 297/992 - loss 0.05125257 - time (sec): 17.53 - samples/sec: 2798.55 - lr: 0.000037 - momentum: 0.000000
2023-10-13 21:40:57,854 epoch 4 - iter 396/992 - loss 0.04922105 - time (sec): 23.39 - samples/sec: 2782.53 - lr: 0.000037 - momentum: 0.000000
2023-10-13 21:41:03,725 epoch 4 - iter 495/992 - loss 0.04834639 - time (sec): 29.26 - samples/sec: 2783.19 - lr: 0.000036 - momentum: 0.000000
2023-10-13 21:41:09,534 epoch 4 - iter 594/992 - loss 0.04924934 - time (sec): 35.07 - samples/sec: 2786.12 - lr: 0.000036 - momentum: 0.000000
2023-10-13 21:41:15,489 epoch 4 - iter 693/992 - loss 0.04948734 - time (sec): 41.02 - samples/sec: 2778.64 - lr: 0.000035 - momentum: 0.000000
2023-10-13 21:41:21,333 epoch 4 - iter 792/992 - loss 0.04908108 - time (sec): 46.87 - samples/sec: 2773.09 - lr: 0.000034 - momentum: 0.000000
2023-10-13 21:41:27,452 epoch 4 - iter 891/992 - loss 0.04886483 - time (sec): 52.98 - samples/sec: 2765.98 - lr: 0.000034 - momentum: 0.000000
2023-10-13 21:41:33,681 epoch 4 - iter 990/992 - loss 0.04998705 - time (sec): 59.21 - samples/sec: 2765.34 - lr: 0.000033 - momentum: 0.000000
2023-10-13 21:41:33,794 ----------------------------------------------------------------------------------------------------
2023-10-13 21:41:33,794 EPOCH 4 done: loss 0.0499 - lr: 0.000033
2023-10-13 21:41:37,201 DEV : loss 0.11597025394439697 - f1-score (micro avg)  0.7675
2023-10-13 21:41:37,222 saving best model
2023-10-13 21:41:37,774 ----------------------------------------------------------------------------------------------------
2023-10-13 21:41:43,640 epoch 5 - iter 99/992 - loss 0.04977178 - time (sec): 5.86 - samples/sec: 2838.67 - lr: 0.000033 - momentum: 0.000000
2023-10-13 21:41:49,509 epoch 5 - iter 198/992 - loss 0.04238345 - time (sec): 11.73 - samples/sec: 2809.15 - lr: 0.000032 - momentum: 0.000000
2023-10-13 21:41:55,313 epoch 5 - iter 297/992 - loss 0.04044960 - time (sec): 17.53 - samples/sec: 2809.68 - lr: 0.000032 - momentum: 0.000000
2023-10-13 21:42:01,758 epoch 5 - iter 396/992 - loss 0.03908575 - time (sec): 23.98 - samples/sec: 2780.41 - lr: 0.000031 - momentum: 0.000000
2023-10-13 21:42:07,657 epoch 5 - iter 495/992 - loss 0.03930276 - time (sec): 29.88 - samples/sec: 2784.79 - lr: 0.000031 - momentum: 0.000000
2023-10-13 21:42:13,458 epoch 5 - iter 594/992 - loss 0.03815672 - time (sec): 35.68 - samples/sec: 2781.01 - lr: 0.000030 - momentum: 0.000000
2023-10-13 21:42:19,292 epoch 5 - iter 693/992 - loss 0.03945281 - time (sec): 41.51 - samples/sec: 2772.88 - lr: 0.000029 - momentum: 0.000000
2023-10-13 21:42:24,988 epoch 5 - iter 792/992 - loss 0.03857818 - time (sec): 47.21 - samples/sec: 2787.22 - lr: 0.000029 - momentum: 0.000000
2023-10-13 21:42:31,007 epoch 5 - iter 891/992 - loss 0.03988100 - time (sec): 53.23 - samples/sec: 2781.52 - lr: 0.000028 - momentum: 0.000000
2023-10-13 21:42:36,766 epoch 5 - iter 990/992 - loss 0.04081192 - time (sec): 58.99 - samples/sec: 2773.33 - lr: 0.000028 - momentum: 0.000000
2023-10-13 21:42:36,903 ----------------------------------------------------------------------------------------------------
2023-10-13 21:42:36,903 EPOCH 5 done: loss 0.0408 - lr: 0.000028
2023-10-13 21:42:41,542 DEV : loss 0.13763324916362762 - f1-score (micro avg)  0.7589
2023-10-13 21:42:41,580 ----------------------------------------------------------------------------------------------------
2023-10-13 21:42:47,750 epoch 6 - iter 99/992 - loss 0.02422172 - time (sec): 6.17 - samples/sec: 2688.98 - lr: 0.000027 - momentum: 0.000000
2023-10-13 21:42:53,863 epoch 6 - iter 198/992 - loss 0.02863550 - time (sec): 12.28 - samples/sec: 2753.55 - lr: 0.000027 - momentum: 0.000000
2023-10-13 21:42:59,675 epoch 6 - iter 297/992 - loss 0.02988685 - time (sec): 18.09 - samples/sec: 2746.20 - lr: 0.000026 - momentum: 0.000000
2023-10-13 21:43:05,663 epoch 6 - iter 396/992 - loss 0.02833903 - time (sec): 24.08 - samples/sec: 2722.44 - lr: 0.000026 - momentum: 0.000000
2023-10-13 21:43:11,508 epoch 6 - iter 495/992 - loss 0.02786454 - time (sec): 29.93 - samples/sec: 2737.83 - lr: 0.000025 - momentum: 0.000000
2023-10-13 21:43:17,288 epoch 6 - iter 594/992 - loss 0.02866839 - time (sec): 35.71 - samples/sec: 2755.15 - lr: 0.000024 - momentum: 0.000000
2023-10-13 21:43:23,408 epoch 6 - iter 693/992 - loss 0.02806587 - time (sec): 41.83 - samples/sec: 2753.01 - lr: 0.000024 - momentum: 0.000000
2023-10-13 21:43:29,273 epoch 6 - iter 792/992 - loss 0.02884132 - time (sec): 47.69 - samples/sec: 2755.88 - lr: 0.000023 - momentum: 0.000000
2023-10-13 21:43:35,118 epoch 6 - iter 891/992 - loss 0.02883868 - time (sec): 53.54 - samples/sec: 2753.63 - lr: 0.000023 - momentum: 0.000000
2023-10-13 21:43:41,138 epoch 6 - iter 990/992 - loss 0.02863755 - time (sec): 59.56 - samples/sec: 2748.21 - lr: 0.000022 - momentum: 0.000000
2023-10-13 21:43:41,250 ----------------------------------------------------------------------------------------------------
2023-10-13 21:43:41,250 EPOCH 6 done: loss 0.0288 - lr: 0.000022
2023-10-13 21:43:44,796 DEV : loss 0.17916934192180634 - f1-score (micro avg)  0.7563
2023-10-13 21:43:44,821 ----------------------------------------------------------------------------------------------------
2023-10-13 21:43:50,755 epoch 7 - iter 99/992 - loss 0.01996214 - time (sec): 5.93 - samples/sec: 2590.21 - lr: 0.000022 - momentum: 0.000000
2023-10-13 21:43:57,458 epoch 7 - iter 198/992 - loss 0.02277291 - time (sec): 12.63 - samples/sec: 2548.91 - lr: 0.000021 - momentum: 0.000000
2023-10-13 21:44:03,311 epoch 7 - iter 297/992 - loss 0.02180867 - time (sec): 18.49 - samples/sec: 2584.45 - lr: 0.000021 - momentum: 0.000000
2023-10-13 21:44:09,225 epoch 7 - iter 396/992 - loss 0.02147002 - time (sec): 24.40 - samples/sec: 2621.14 - lr: 0.000020 - momentum: 0.000000
2023-10-13 21:44:15,345 epoch 7 - iter 495/992 - loss 0.02147281 - time (sec): 30.52 - samples/sec: 2658.70 - lr: 0.000019 - momentum: 0.000000
2023-10-13 21:44:21,676 epoch 7 - iter 594/992 - loss 0.02278340 - time (sec): 36.85 - samples/sec: 2672.02 - lr: 0.000019 - momentum: 0.000000
2023-10-13 21:44:27,314 epoch 7 - iter 693/992 - loss 0.02302071 - time (sec): 42.49 - samples/sec: 2675.36 - lr: 0.000018 - momentum: 0.000000
2023-10-13 21:44:33,289 epoch 7 - iter 792/992 - loss 0.02238297 - time (sec): 48.47 - samples/sec: 2696.36 - lr: 0.000018 - momentum: 0.000000
2023-10-13 21:44:38,932 epoch 7 - iter 891/992 - loss 0.02194504 - time (sec): 54.11 - samples/sec: 2717.35 - lr: 0.000017 - momentum: 0.000000
2023-10-13 21:44:44,806 epoch 7 - iter 990/992 - loss 0.02178032 - time (sec): 59.98 - samples/sec: 2727.36 - lr: 0.000017 - momentum: 0.000000
2023-10-13 21:44:44,942 ----------------------------------------------------------------------------------------------------
2023-10-13 21:44:44,942 EPOCH 7 done: loss 0.0217 - lr: 0.000017
2023-10-13 21:44:48,386 DEV : loss 0.19963695108890533 - f1-score (micro avg)  0.7511
2023-10-13 21:44:48,411 ----------------------------------------------------------------------------------------------------
2023-10-13 21:44:54,512 epoch 8 - iter 99/992 - loss 0.00802566 - time (sec): 6.10 - samples/sec: 2678.43 - lr: 0.000016 - momentum: 0.000000
2023-10-13 21:45:00,449 epoch 8 - iter 198/992 - loss 0.01380324 - time (sec): 12.04 - samples/sec: 2738.01 - lr: 0.000016 - momentum: 0.000000
2023-10-13 21:45:06,122 epoch 8 - iter 297/992 - loss 0.01227669 - time (sec): 17.71 - samples/sec: 2757.67 - lr: 0.000015 - momentum: 0.000000
2023-10-13 21:45:12,518 epoch 8 - iter 396/992 - loss 0.01400552 - time (sec): 24.11 - samples/sec: 2729.93 - lr: 0.000014 - momentum: 0.000000
2023-10-13 21:45:18,333 epoch 8 - iter 495/992 - loss 0.01475903 - time (sec): 29.92 - samples/sec: 2736.32 - lr: 0.000014 - momentum: 0.000000
2023-10-13 21:45:23,984 epoch 8 - iter 594/992 - loss 0.01530820 - time (sec): 35.57 - samples/sec: 2735.01 - lr: 0.000013 - momentum: 0.000000
2023-10-13 21:45:30,069 epoch 8 - iter 693/992 - loss 0.01589983 - time (sec): 41.66 - samples/sec: 2736.91 - lr: 0.000013 - momentum: 0.000000
2023-10-13 21:45:36,359 epoch 8 - iter 792/992 - loss 0.01603725 - time (sec): 47.95 - samples/sec: 2744.42 - lr: 0.000012 - momentum: 0.000000
2023-10-13 21:45:42,042 epoch 8 - iter 891/992 - loss 0.01533557 - time (sec): 53.63 - samples/sec: 2749.62 - lr: 0.000012 - momentum: 0.000000
2023-10-13 21:45:47,922 epoch 8 - iter 990/992 - loss 0.01568211 - time (sec): 59.51 - samples/sec: 2751.39 - lr: 0.000011 - momentum: 0.000000
2023-10-13 21:45:48,031 ----------------------------------------------------------------------------------------------------
2023-10-13 21:45:48,031 EPOCH 8 done: loss 0.0157 - lr: 0.000011
2023-10-13 21:45:51,538 DEV : loss 0.2106373906135559 - f1-score (micro avg)  0.7511
2023-10-13 21:45:51,560 ----------------------------------------------------------------------------------------------------
2023-10-13 21:45:57,529 epoch 9 - iter 99/992 - loss 0.00887012 - time (sec): 5.97 - samples/sec: 2806.18 - lr: 0.000011 - momentum: 0.000000
2023-10-13 21:46:03,489 epoch 9 - iter 198/992 - loss 0.00899000 - time (sec): 11.93 - samples/sec: 2748.69 - lr: 0.000010 - momentum: 0.000000
2023-10-13 21:46:09,616 epoch 9 - iter 297/992 - loss 0.00742884 - time (sec): 18.05 - samples/sec: 2694.16 - lr: 0.000009 - momentum: 0.000000
2023-10-13 21:46:15,833 epoch 9 - iter 396/992 - loss 0.00794530 - time (sec): 24.27 - samples/sec: 2710.07 - lr: 0.000009 - momentum: 0.000000
2023-10-13 21:46:21,629 epoch 9 - iter 495/992 - loss 0.00848285 - time (sec): 30.07 - samples/sec: 2744.72 - lr: 0.000008 - momentum: 0.000000
2023-10-13 21:46:27,363 epoch 9 - iter 594/992 - loss 0.00917051 - time (sec): 35.80 - samples/sec: 2739.94 - lr: 0.000008 - momentum: 0.000000
2023-10-13 21:46:32,985 epoch 9 - iter 693/992 - loss 0.00939071 - time (sec): 41.42 - samples/sec: 2755.95 - lr: 0.000007 - momentum: 0.000000
2023-10-13 21:46:38,976 epoch 9 - iter 792/992 - loss 0.01010125 - time (sec): 47.41 - samples/sec: 2762.85 - lr: 0.000007 - momentum: 0.000000
2023-10-13 21:46:45,090 epoch 9 - iter 891/992 - loss 0.01075447 - time (sec): 53.53 - samples/sec: 2761.88 - lr: 0.000006 - momentum: 0.000000
2023-10-13 21:46:50,883 epoch 9 - iter 990/992 - loss 0.01068750 - time (sec): 59.32 - samples/sec: 2757.49 - lr: 0.000006 - momentum: 0.000000
2023-10-13 21:46:51,018 ----------------------------------------------------------------------------------------------------
2023-10-13 21:46:51,018 EPOCH 9 done: loss 0.0107 - lr: 0.000006
2023-10-13 21:46:54,559 DEV : loss 0.22137963771820068 - f1-score (micro avg)  0.7493
2023-10-13 21:46:54,581 ----------------------------------------------------------------------------------------------------
2023-10-13 21:47:00,669 epoch 10 - iter 99/992 - loss 0.00885115 - time (sec): 6.09 - samples/sec: 2800.45 - lr: 0.000005 - momentum: 0.000000
2023-10-13 21:47:06,567 epoch 10 - iter 198/992 - loss 0.00701261 - time (sec): 11.98 - samples/sec: 2734.37 - lr: 0.000004 - momentum: 0.000000
2023-10-13 21:47:12,490 epoch 10 - iter 297/992 - loss 0.00709994 - time (sec): 17.91 - samples/sec: 2725.21 - lr: 0.000004 - momentum: 0.000000
2023-10-13 21:47:19,212 epoch 10 - iter 396/992 - loss 0.00713180 - time (sec): 24.63 - samples/sec: 2657.26 - lr: 0.000003 - momentum: 0.000000
2023-10-13 21:47:24,800 epoch 10 - iter 495/992 - loss 0.00712523 - time (sec): 30.22 - samples/sec: 2696.43 - lr: 0.000003 - momentum: 0.000000
2023-10-13 21:47:30,657 epoch 10 - iter 594/992 - loss 0.00752130 - time (sec): 36.07 - samples/sec: 2715.70 - lr: 0.000002 - momentum: 0.000000
2023-10-13 21:47:36,606 epoch 10 - iter 693/992 - loss 0.00756143 - time (sec): 42.02 - samples/sec: 2720.45 - lr: 0.000002 - momentum: 0.000000
2023-10-13 21:47:42,502 epoch 10 - iter 792/992 - loss 0.00745144 - time (sec): 47.92 - samples/sec: 2732.37 - lr: 0.000001 - momentum: 0.000000
2023-10-13 21:47:48,499 epoch 10 - iter 891/992 - loss 0.00734368 - time (sec): 53.92 - samples/sec: 2744.99 - lr: 0.000001 - momentum: 0.000000
2023-10-13 21:47:54,258 epoch 10 - iter 990/992 - loss 0.00723729 - time (sec): 59.68 - samples/sec: 2742.25 - lr: 0.000000 - momentum: 0.000000
2023-10-13 21:47:54,381 ----------------------------------------------------------------------------------------------------
2023-10-13 21:47:54,381 EPOCH 10 done: loss 0.0072 - lr: 0.000000
2023-10-13 21:47:57,909 DEV : loss 0.22665657103061676 - f1-score (micro avg)  0.7513
2023-10-13 21:47:58,383 ----------------------------------------------------------------------------------------------------
2023-10-13 21:47:58,384 Loading model from best epoch ...
2023-10-13 21:47:59,826 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-13 21:48:03,320 
Results:
- F-score (micro) 0.7736
- F-score (macro) 0.6613
- Accuracy 0.6522

By class:
              precision    recall  f1-score   support

         LOC     0.7826    0.8794    0.8282       655
         PER     0.8556    0.6906    0.7643       223
         ORG     0.5968    0.2913    0.3915       127

   micro avg     0.7843    0.7632    0.7736      1005
   macro avg     0.7450    0.6204    0.6613      1005
weighted avg     0.7753    0.7632    0.7588      1005

2023-10-13 21:48:03,320 ----------------------------------------------------------------------------------------------------
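
Note on the aggregate rows: the sketch below is not part of the training run. It is a minimal check, written under the assumption that the report follows the usual conventions (micro F1 as the harmonic mean of micro precision and recall, macro F1 as the unweighted mean of the per-class F1 scores, weighted F1 as the support-weighted mean). The names `rows`, `harmonic_f1`, `micro_f1`, `macro_f1` and `weighted_f1` are illustrative only; the numbers are copied from the "By class" table above.

```python
# Per-class rows copied from the report: (precision, recall, f1, support).
rows = {
    "LOC": (0.7826, 0.8794, 0.8282, 655),
    "PER": (0.8556, 0.6906, 0.7643, 223),
    "ORG": (0.5968, 0.2913, 0.3915, 127),
}


def harmonic_f1(precision: float, recall: float) -> float:
    """F1 as the harmonic mean of precision and recall (0.0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Micro F1: recomputed from the reported micro-averaged precision and recall.
micro_f1 = harmonic_f1(0.7843, 0.7632)

# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in rows.values()) / len(rows)

# Weighted F1: per-class F1 scores weighted by support.
total_support = sum(support for _, _, _, support in rows.values())
weighted_f1 = sum(f1 * support for _, _, f1, support in rows.values()) / total_support

print(f"micro F1    {micro_f1:.4f}")
print(f"macro F1    {macro_f1:.4f}")
print(f"weighted F1 {weighted_f1:.4f}")
print(f"support     {total_support}")
```

Because the inputs are already rounded to four decimals, the recomputed values can only be expected to agree with the reported 0.7736, 0.6613 and 0.7588 up to rounding. The gap between micro and macro F1 comes from the ORG class, which has the lowest F1 and the smallest support.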