2023-10-17 13:42:53,754 ----------------------------------------------------------------------------------------------------
2023-10-17 13:42:53,755 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 13:42:53,755 ----------------------------------------------------------------------------------------------------
2023-10-17 13:42:53,755 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-17 13:42:53,756 ----------------------------------------------------------------------------------------------------
2023-10-17 13:42:53,756 Train:  7936 sentences
2023-10-17 13:42:53,756         (train_with_dev=False, train_with_test=False)
2023-10-17 13:42:53,756 ----------------------------------------------------------------------------------------------------
2023-10-17 13:42:53,756 Training Params:
2023-10-17 13:42:53,756  - learning_rate: "5e-05" 
2023-10-17 13:42:53,756  - mini_batch_size: "4"
2023-10-17 13:42:53,756  - max_epochs: "10"
2023-10-17 13:42:53,756  - shuffle: "True"
2023-10-17 13:42:53,756 ----------------------------------------------------------------------------------------------------
2023-10-17 13:42:53,756 Plugins:
2023-10-17 13:42:53,756  - TensorboardLogger
2023-10-17 13:42:53,756  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 13:42:53,756 ----------------------------------------------------------------------------------------------------
2023-10-17 13:42:53,756 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 13:42:53,756  - metric: "('micro avg', 'f1-score')"
2023-10-17 13:42:53,756 ----------------------------------------------------------------------------------------------------
2023-10-17 13:42:53,756 Computation:
2023-10-17 13:42:53,756  - compute on device: cuda:0
2023-10-17 13:42:53,756  - embedding storage: none
2023-10-17 13:42:53,756 ----------------------------------------------------------------------------------------------------
2023-10-17 13:42:53,756 Model training base path: "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 13:42:53,756 ----------------------------------------------------------------------------------------------------
2023-10-17 13:42:53,756 ----------------------------------------------------------------------------------------------------
2023-10-17 13:42:53,756 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 13:43:02,902 epoch 1 - iter 198/1984 - loss 1.99194816 - time (sec): 9.14 - samples/sec: 1812.16 - lr: 0.000005 - momentum: 0.000000
2023-10-17 13:43:12,040 epoch 1 - iter 396/1984 - loss 1.12529108 - time (sec): 18.28 - samples/sec: 1834.45 - lr: 0.000010 - momentum: 0.000000
2023-10-17 13:43:20,571 epoch 1 - iter 594/1984 - loss 0.84075243 - time (sec): 26.81 - samples/sec: 1828.97 - lr: 0.000015 - momentum: 0.000000
2023-10-17 13:43:29,636 epoch 1 - iter 792/1984 - loss 0.67948072 - time (sec): 35.88 - samples/sec: 1809.73 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:43:38,961 epoch 1 - iter 990/1984 - loss 0.56666412 - time (sec): 45.20 - samples/sec: 1828.99 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:43:47,711 epoch 1 - iter 1188/1984 - loss 0.50179248 - time (sec): 53.95 - samples/sec: 1829.62 - lr: 0.000030 - momentum: 0.000000
2023-10-17 13:43:56,331 epoch 1 - iter 1386/1984 - loss 0.45691609 - time (sec): 62.57 - samples/sec: 1836.93 - lr: 0.000035 - momentum: 0.000000
2023-10-17 13:44:05,242 epoch 1 - iter 1584/1984 - loss 0.41865425 - time (sec): 71.48 - samples/sec: 1841.02 - lr: 0.000040 - momentum: 0.000000
2023-10-17 13:44:14,524 epoch 1 - iter 1782/1984 - loss 0.38769661 - time (sec): 80.77 - samples/sec: 1834.44 - lr: 0.000045 - momentum: 0.000000
2023-10-17 13:44:23,569 epoch 1 - iter 1980/1984 - loss 0.36361201 - time (sec): 89.81 - samples/sec: 1822.66 - lr: 0.000050 - momentum: 0.000000
2023-10-17 13:44:23,744 ----------------------------------------------------------------------------------------------------
2023-10-17 13:44:23,744 EPOCH 1 done: loss 0.3635 - lr: 0.000050
2023-10-17 13:44:26,902 DEV : loss 0.1107097640633583 - f1-score (micro avg)  0.7147
2023-10-17 13:44:26,923 saving best model
2023-10-17 13:44:27,399 ----------------------------------------------------------------------------------------------------
2023-10-17 13:44:36,392 epoch 2 - iter 198/1984 - loss 0.12198282 - time (sec): 8.99 - samples/sec: 1816.59 - lr: 0.000049 - momentum: 0.000000
2023-10-17 13:44:45,686 epoch 2 - iter 396/1984 - loss 0.12288858 - time (sec): 18.29 - samples/sec: 1802.37 - lr: 0.000049 - momentum: 0.000000
2023-10-17 13:44:54,485 epoch 2 - iter 594/1984 - loss 0.12158637 - time (sec): 27.08 - samples/sec: 1790.77 - lr: 0.000048 - momentum: 0.000000
2023-10-17 13:45:03,453 epoch 2 - iter 792/1984 - loss 0.12378321 - time (sec): 36.05 - samples/sec: 1793.41 - lr: 0.000048 - momentum: 0.000000
2023-10-17 13:45:12,572 epoch 2 - iter 990/1984 - loss 0.11981922 - time (sec): 45.17 - samples/sec: 1793.52 - lr: 0.000047 - momentum: 0.000000
2023-10-17 13:45:21,725 epoch 2 - iter 1188/1984 - loss 0.11892503 - time (sec): 54.32 - samples/sec: 1795.43 - lr: 0.000047 - momentum: 0.000000
2023-10-17 13:45:30,864 epoch 2 - iter 1386/1984 - loss 0.11995066 - time (sec): 63.46 - samples/sec: 1807.05 - lr: 0.000046 - momentum: 0.000000
2023-10-17 13:45:40,092 epoch 2 - iter 1584/1984 - loss 0.11971087 - time (sec): 72.69 - samples/sec: 1801.71 - lr: 0.000046 - momentum: 0.000000
2023-10-17 13:45:49,162 epoch 2 - iter 1782/1984 - loss 0.11939199 - time (sec): 81.76 - samples/sec: 1797.41 - lr: 0.000045 - momentum: 0.000000
2023-10-17 13:45:58,300 epoch 2 - iter 1980/1984 - loss 0.11867051 - time (sec): 90.90 - samples/sec: 1802.10 - lr: 0.000044 - momentum: 0.000000
2023-10-17 13:45:58,474 ----------------------------------------------------------------------------------------------------
2023-10-17 13:45:58,474 EPOCH 2 done: loss 0.1188 - lr: 0.000044
2023-10-17 13:46:02,349 DEV : loss 0.09292253851890564 - f1-score (micro avg)  0.7461
2023-10-17 13:46:02,372 saving best model
2023-10-17 13:46:02,884 ----------------------------------------------------------------------------------------------------
2023-10-17 13:46:12,137 epoch 3 - iter 198/1984 - loss 0.08093006 - time (sec): 9.25 - samples/sec: 1833.94 - lr: 0.000044 - momentum: 0.000000
2023-10-17 13:46:21,419 epoch 3 - iter 396/1984 - loss 0.08776510 - time (sec): 18.53 - samples/sec: 1800.42 - lr: 0.000043 - momentum: 0.000000
2023-10-17 13:46:30,417 epoch 3 - iter 594/1984 - loss 0.08770434 - time (sec): 27.53 - samples/sec: 1795.44 - lr: 0.000043 - momentum: 0.000000
2023-10-17 13:46:39,552 epoch 3 - iter 792/1984 - loss 0.08713512 - time (sec): 36.67 - samples/sec: 1797.16 - lr: 0.000042 - momentum: 0.000000
2023-10-17 13:46:48,582 epoch 3 - iter 990/1984 - loss 0.08915958 - time (sec): 45.70 - samples/sec: 1808.80 - lr: 0.000042 - momentum: 0.000000
2023-10-17 13:46:57,705 epoch 3 - iter 1188/1984 - loss 0.08975126 - time (sec): 54.82 - samples/sec: 1817.72 - lr: 0.000041 - momentum: 0.000000
2023-10-17 13:47:06,969 epoch 3 - iter 1386/1984 - loss 0.08933047 - time (sec): 64.08 - samples/sec: 1822.52 - lr: 0.000041 - momentum: 0.000000
2023-10-17 13:47:16,293 epoch 3 - iter 1584/1984 - loss 0.08968171 - time (sec): 73.41 - samples/sec: 1814.20 - lr: 0.000040 - momentum: 0.000000
2023-10-17 13:47:25,592 epoch 3 - iter 1782/1984 - loss 0.09022258 - time (sec): 82.71 - samples/sec: 1798.16 - lr: 0.000039 - momentum: 0.000000
2023-10-17 13:47:34,674 epoch 3 - iter 1980/1984 - loss 0.08989172 - time (sec): 91.79 - samples/sec: 1783.16 - lr: 0.000039 - momentum: 0.000000
2023-10-17 13:47:34,856 ----------------------------------------------------------------------------------------------------
2023-10-17 13:47:34,856 EPOCH 3 done: loss 0.0899 - lr: 0.000039
2023-10-17 13:47:38,254 DEV : loss 0.11529310792684555 - f1-score (micro avg)  0.7554
2023-10-17 13:47:38,275 saving best model
2023-10-17 13:47:38,842 ----------------------------------------------------------------------------------------------------
2023-10-17 13:47:47,509 epoch 4 - iter 198/1984 - loss 0.05901225 - time (sec): 8.66 - samples/sec: 1941.15 - lr: 0.000038 - momentum: 0.000000
2023-10-17 13:47:56,529 epoch 4 - iter 396/1984 - loss 0.07172775 - time (sec): 17.68 - samples/sec: 1856.82 - lr: 0.000038 - momentum: 0.000000
2023-10-17 13:48:05,245 epoch 4 - iter 594/1984 - loss 0.06965845 - time (sec): 26.40 - samples/sec: 1845.67 - lr: 0.000037 - momentum: 0.000000
2023-10-17 13:48:13,933 epoch 4 - iter 792/1984 - loss 0.07214150 - time (sec): 35.09 - samples/sec: 1873.43 - lr: 0.000037 - momentum: 0.000000
2023-10-17 13:48:22,554 epoch 4 - iter 990/1984 - loss 0.07156799 - time (sec): 43.71 - samples/sec: 1878.28 - lr: 0.000036 - momentum: 0.000000
2023-10-17 13:48:31,704 epoch 4 - iter 1188/1984 - loss 0.07472710 - time (sec): 52.86 - samples/sec: 1874.11 - lr: 0.000036 - momentum: 0.000000
2023-10-17 13:48:40,821 epoch 4 - iter 1386/1984 - loss 0.07315463 - time (sec): 61.97 - samples/sec: 1855.64 - lr: 0.000035 - momentum: 0.000000
2023-10-17 13:48:49,969 epoch 4 - iter 1584/1984 - loss 0.07446253 - time (sec): 71.12 - samples/sec: 1846.18 - lr: 0.000034 - momentum: 0.000000
2023-10-17 13:48:59,484 epoch 4 - iter 1782/1984 - loss 0.07318315 - time (sec): 80.64 - samples/sec: 1826.73 - lr: 0.000034 - momentum: 0.000000
2023-10-17 13:49:09,059 epoch 4 - iter 1980/1984 - loss 0.07161599 - time (sec): 90.21 - samples/sec: 1815.30 - lr: 0.000033 - momentum: 0.000000
2023-10-17 13:49:09,239 ----------------------------------------------------------------------------------------------------
2023-10-17 13:49:09,239 EPOCH 4 done: loss 0.0716 - lr: 0.000033
2023-10-17 13:49:12,748 DEV : loss 0.16965167224407196 - f1-score (micro avg)  0.7562
2023-10-17 13:49:12,770 saving best model
2023-10-17 13:49:13,358 ----------------------------------------------------------------------------------------------------
2023-10-17 13:49:22,504 epoch 5 - iter 198/1984 - loss 0.05142115 - time (sec): 9.14 - samples/sec: 1744.82 - lr: 0.000033 - momentum: 0.000000
2023-10-17 13:49:31,639 epoch 5 - iter 396/1984 - loss 0.05326253 - time (sec): 18.28 - samples/sec: 1783.51 - lr: 0.000032 - momentum: 0.000000
2023-10-17 13:49:40,726 epoch 5 - iter 594/1984 - loss 0.05234995 - time (sec): 27.36 - samples/sec: 1766.40 - lr: 0.000032 - momentum: 0.000000
2023-10-17 13:49:49,869 epoch 5 - iter 792/1984 - loss 0.05493024 - time (sec): 36.51 - samples/sec: 1766.69 - lr: 0.000031 - momentum: 0.000000
2023-10-17 13:49:59,222 epoch 5 - iter 990/1984 - loss 0.05547255 - time (sec): 45.86 - samples/sec: 1768.21 - lr: 0.000031 - momentum: 0.000000
2023-10-17 13:50:08,703 epoch 5 - iter 1188/1984 - loss 0.05535415 - time (sec): 55.34 - samples/sec: 1775.28 - lr: 0.000030 - momentum: 0.000000
2023-10-17 13:50:17,849 epoch 5 - iter 1386/1984 - loss 0.05553716 - time (sec): 64.49 - samples/sec: 1780.41 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:50:27,041 epoch 5 - iter 1584/1984 - loss 0.05438664 - time (sec): 73.68 - samples/sec: 1785.59 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:50:36,058 epoch 5 - iter 1782/1984 - loss 0.05420342 - time (sec): 82.70 - samples/sec: 1785.78 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:50:45,233 epoch 5 - iter 1980/1984 - loss 0.05491134 - time (sec): 91.87 - samples/sec: 1780.76 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:50:45,422 ----------------------------------------------------------------------------------------------------
2023-10-17 13:50:45,423 EPOCH 5 done: loss 0.0548 - lr: 0.000028
2023-10-17 13:50:48,854 DEV : loss 0.17186634242534637 - f1-score (micro avg)  0.7583
2023-10-17 13:50:48,875 saving best model
2023-10-17 13:50:49,394 ----------------------------------------------------------------------------------------------------
2023-10-17 13:50:58,709 epoch 6 - iter 198/1984 - loss 0.04210687 - time (sec): 9.31 - samples/sec: 1797.63 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:51:07,945 epoch 6 - iter 396/1984 - loss 0.04558327 - time (sec): 18.55 - samples/sec: 1786.69 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:51:17,296 epoch 6 - iter 594/1984 - loss 0.04411621 - time (sec): 27.90 - samples/sec: 1776.05 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:51:26,833 epoch 6 - iter 792/1984 - loss 0.04189826 - time (sec): 37.44 - samples/sec: 1766.81 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:51:35,758 epoch 6 - iter 990/1984 - loss 0.04186020 - time (sec): 46.36 - samples/sec: 1790.47 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:51:44,892 epoch 6 - iter 1188/1984 - loss 0.04179555 - time (sec): 55.50 - samples/sec: 1784.39 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:51:54,014 epoch 6 - iter 1386/1984 - loss 0.04062803 - time (sec): 64.62 - samples/sec: 1802.35 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:52:03,187 epoch 6 - iter 1584/1984 - loss 0.04047753 - time (sec): 73.79 - samples/sec: 1794.79 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:52:12,464 epoch 6 - iter 1782/1984 - loss 0.04096866 - time (sec): 83.07 - samples/sec: 1783.58 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:52:21,511 epoch 6 - iter 1980/1984 - loss 0.04137447 - time (sec): 92.11 - samples/sec: 1776.94 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:52:21,692 ----------------------------------------------------------------------------------------------------
2023-10-17 13:52:21,692 EPOCH 6 done: loss 0.0414 - lr: 0.000022
2023-10-17 13:52:25,752 DEV : loss 0.1979617029428482 - f1-score (micro avg)  0.7635
2023-10-17 13:52:25,775 saving best model
2023-10-17 13:52:26,295 ----------------------------------------------------------------------------------------------------
2023-10-17 13:52:35,522 epoch 7 - iter 198/1984 - loss 0.02932924 - time (sec): 9.22 - samples/sec: 1697.08 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:52:45,099 epoch 7 - iter 396/1984 - loss 0.02557373 - time (sec): 18.80 - samples/sec: 1741.96 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:52:54,288 epoch 7 - iter 594/1984 - loss 0.02664735 - time (sec): 27.99 - samples/sec: 1753.83 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:53:03,396 epoch 7 - iter 792/1984 - loss 0.02794794 - time (sec): 37.10 - samples/sec: 1758.17 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:53:12,526 epoch 7 - iter 990/1984 - loss 0.02878967 - time (sec): 46.23 - samples/sec: 1767.18 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:53:21,604 epoch 7 - iter 1188/1984 - loss 0.02864416 - time (sec): 55.30 - samples/sec: 1761.33 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:53:30,974 epoch 7 - iter 1386/1984 - loss 0.02877632 - time (sec): 64.67 - samples/sec: 1751.38 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:53:40,189 epoch 7 - iter 1584/1984 - loss 0.02875581 - time (sec): 73.89 - samples/sec: 1754.20 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:53:49,370 epoch 7 - iter 1782/1984 - loss 0.02874451 - time (sec): 83.07 - samples/sec: 1755.94 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:53:58,828 epoch 7 - iter 1980/1984 - loss 0.02854042 - time (sec): 92.53 - samples/sec: 1768.83 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:53:59,013 ----------------------------------------------------------------------------------------------------
2023-10-17 13:53:59,014 EPOCH 7 done: loss 0.0285 - lr: 0.000017
2023-10-17 13:54:02,405 DEV : loss 0.21675720810890198 - f1-score (micro avg)  0.756
2023-10-17 13:54:02,426 ----------------------------------------------------------------------------------------------------
2023-10-17 13:54:11,532 epoch 8 - iter 198/1984 - loss 0.02131513 - time (sec): 9.10 - samples/sec: 1755.79 - lr: 0.000016 - momentum: 0.000000
2023-10-17 13:54:20,688 epoch 8 - iter 396/1984 - loss 0.02216696 - time (sec): 18.26 - samples/sec: 1771.05 - lr: 0.000016 - momentum: 0.000000
2023-10-17 13:54:29,689 epoch 8 - iter 594/1984 - loss 0.02326467 - time (sec): 27.26 - samples/sec: 1759.32 - lr: 0.000015 - momentum: 0.000000
2023-10-17 13:54:38,755 epoch 8 - iter 792/1984 - loss 0.02241368 - time (sec): 36.33 - samples/sec: 1763.98 - lr: 0.000014 - momentum: 0.000000
2023-10-17 13:54:47,967 epoch 8 - iter 990/1984 - loss 0.02005196 - time (sec): 45.54 - samples/sec: 1771.30 - lr: 0.000014 - momentum: 0.000000
2023-10-17 13:54:57,291 epoch 8 - iter 1188/1984 - loss 0.01961340 - time (sec): 54.86 - samples/sec: 1775.08 - lr: 0.000013 - momentum: 0.000000
2023-10-17 13:55:06,861 epoch 8 - iter 1386/1984 - loss 0.02145710 - time (sec): 64.43 - samples/sec: 1753.37 - lr: 0.000013 - momentum: 0.000000
2023-10-17 13:55:16,241 epoch 8 - iter 1584/1984 - loss 0.02064329 - time (sec): 73.81 - samples/sec: 1762.80 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:55:25,486 epoch 8 - iter 1782/1984 - loss 0.01981026 - time (sec): 83.06 - samples/sec: 1766.73 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:55:34,752 epoch 8 - iter 1980/1984 - loss 0.02031700 - time (sec): 92.32 - samples/sec: 1772.93 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:55:34,925 ----------------------------------------------------------------------------------------------------
2023-10-17 13:55:34,925 EPOCH 8 done: loss 0.0203 - lr: 0.000011
2023-10-17 13:55:38,336 DEV : loss 0.22816428542137146 - f1-score (micro avg)  0.7689
2023-10-17 13:55:38,358 saving best model
2023-10-17 13:55:38,938 ----------------------------------------------------------------------------------------------------
2023-10-17 13:55:47,982 epoch 9 - iter 198/1984 - loss 0.01192882 - time (sec): 9.04 - samples/sec: 1734.20 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:55:56,918 epoch 9 - iter 396/1984 - loss 0.01286841 - time (sec): 17.97 - samples/sec: 1798.99 - lr: 0.000010 - momentum: 0.000000
2023-10-17 13:56:05,849 epoch 9 - iter 594/1984 - loss 0.01438189 - time (sec): 26.91 - samples/sec: 1772.52 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:56:14,942 epoch 9 - iter 792/1984 - loss 0.01318238 - time (sec): 36.00 - samples/sec: 1770.21 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:56:24,030 epoch 9 - iter 990/1984 - loss 0.01351039 - time (sec): 45.09 - samples/sec: 1778.59 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:56:32,852 epoch 9 - iter 1188/1984 - loss 0.01319278 - time (sec): 53.91 - samples/sec: 1793.12 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:56:41,731 epoch 9 - iter 1386/1984 - loss 0.01309930 - time (sec): 62.79 - samples/sec: 1811.55 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:56:51,161 epoch 9 - iter 1584/1984 - loss 0.01255843 - time (sec): 72.22 - samples/sec: 1810.15 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:57:00,359 epoch 9 - iter 1782/1984 - loss 0.01307612 - time (sec): 81.42 - samples/sec: 1803.41 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:57:09,450 epoch 9 - iter 1980/1984 - loss 0.01305751 - time (sec): 90.51 - samples/sec: 1807.63 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:57:09,640 ----------------------------------------------------------------------------------------------------
2023-10-17 13:57:09,640 EPOCH 9 done: loss 0.0130 - lr: 0.000006
2023-10-17 13:57:13,058 DEV : loss 0.23943665623664856 - f1-score (micro avg)  0.7711
2023-10-17 13:57:13,081 saving best model
2023-10-17 13:57:13,702 ----------------------------------------------------------------------------------------------------
2023-10-17 13:57:22,874 epoch 10 - iter 198/1984 - loss 0.00790504 - time (sec): 9.17 - samples/sec: 1775.44 - lr: 0.000005 - momentum: 0.000000
2023-10-17 13:57:32,117 epoch 10 - iter 396/1984 - loss 0.00810567 - time (sec): 18.41 - samples/sec: 1819.26 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:57:41,199 epoch 10 - iter 594/1984 - loss 0.00780535 - time (sec): 27.49 - samples/sec: 1803.46 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:57:50,230 epoch 10 - iter 792/1984 - loss 0.00784589 - time (sec): 36.53 - samples/sec: 1788.64 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:57:59,203 epoch 10 - iter 990/1984 - loss 0.00744212 - time (sec): 45.50 - samples/sec: 1799.38 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:58:08,284 epoch 10 - iter 1188/1984 - loss 0.00828425 - time (sec): 54.58 - samples/sec: 1802.08 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:58:17,328 epoch 10 - iter 1386/1984 - loss 0.00823271 - time (sec): 63.62 - samples/sec: 1805.08 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:58:26,441 epoch 10 - iter 1584/1984 - loss 0.00807062 - time (sec): 72.74 - samples/sec: 1805.28 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:58:35,408 epoch 10 - iter 1782/1984 - loss 0.00842240 - time (sec): 81.70 - samples/sec: 1803.89 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:58:44,477 epoch 10 - iter 1980/1984 - loss 0.00880008 - time (sec): 90.77 - samples/sec: 1803.88 - lr: 0.000000 - momentum: 0.000000
2023-10-17 13:58:44,651 ----------------------------------------------------------------------------------------------------
2023-10-17 13:58:44,651 EPOCH 10 done: loss 0.0088 - lr: 0.000000
2023-10-17 13:58:48,192 DEV : loss 0.2471495419740677 - f1-score (micro avg)  0.7779
2023-10-17 13:58:48,221 saving best model
2023-10-17 13:58:49,148 ----------------------------------------------------------------------------------------------------
2023-10-17 13:58:49,150 Loading model from best epoch ...
2023-10-17 13:58:51,929 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 13:58:54,795 
Results:
- F-score (micro) 0.7672
- F-score (macro) 0.6813
- Accuracy 0.653

By class:
              precision    recall  f1-score   support

         LOC     0.8318    0.8382    0.8350       655
         PER     0.6811    0.7758    0.7254       223
         ORG     0.5043    0.4646    0.4836       127

   micro avg     0.7575    0.7771    0.7672      1005
   macro avg     0.6724    0.6928    0.6813      1005
weighted avg     0.7570    0.7771    0.7663      1005

2023-10-17 13:58:54,795 ----------------------------------------------------------------------------------------------------