2024-03-26 10:12:07,550 ----------------------------------------------------------------------------------------------------
2024-03-26 10:12:07,551 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(31103, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2024-03-26 10:12:07,551 ----------------------------------------------------------------------------------------------------
2024-03-26 10:12:07,551 Corpus: 758 train + 94 dev + 96 test sentences
2024-03-26 10:12:07,551 ----------------------------------------------------------------------------------------------------
2024-03-26 10:12:07,551 Train:  758 sentences
2024-03-26 10:12:07,551         (train_with_dev=False, train_with_test=False)
2024-03-26 10:12:07,551 ----------------------------------------------------------------------------------------------------
2024-03-26 10:12:07,551 Training Params:
2024-03-26 10:12:07,551  - learning_rate: "5e-05" 
2024-03-26 10:12:07,551  - mini_batch_size: "8"
2024-03-26 10:12:07,551  - max_epochs: "10"
2024-03-26 10:12:07,551  - shuffle: "True"
2024-03-26 10:12:07,551 ----------------------------------------------------------------------------------------------------
2024-03-26 10:12:07,551 Plugins:
2024-03-26 10:12:07,551  - TensorboardLogger
2024-03-26 10:12:07,551  - LinearScheduler | warmup_fraction: '0.1'
2024-03-26 10:12:07,551 ----------------------------------------------------------------------------------------------------
2024-03-26 10:12:07,551 Final evaluation on model from best epoch (best-model.pt)
2024-03-26 10:12:07,551  - metric: "('micro avg', 'f1-score')"
2024-03-26 10:12:07,551 ----------------------------------------------------------------------------------------------------
2024-03-26 10:12:07,551 Computation:
2024-03-26 10:12:07,551  - compute on device: cuda:0
2024-03-26 10:12:07,551  - embedding storage: none
2024-03-26 10:12:07,551 ----------------------------------------------------------------------------------------------------
2024-03-26 10:12:07,551 Model training base path: "flair-co-funer-gbert_base-bs8-e10-lr5e-05-3"
2024-03-26 10:12:07,551 ----------------------------------------------------------------------------------------------------
2024-03-26 10:12:07,551 ----------------------------------------------------------------------------------------------------
2024-03-26 10:12:07,551 Logging anything other than scalars to TensorBoard is currently not supported.
2024-03-26 10:12:08,913 epoch 1 - iter 9/95 - loss 3.30878654 - time (sec): 1.36 - samples/sec: 2342.42 - lr: 0.000004 - momentum: 0.000000
2024-03-26 10:12:10,727 epoch 1 - iter 18/95 - loss 3.10425395 - time (sec): 3.18 - samples/sec: 1988.82 - lr: 0.000009 - momentum: 0.000000
2024-03-26 10:12:12,640 epoch 1 - iter 27/95 - loss 2.80107707 - time (sec): 5.09 - samples/sec: 1941.49 - lr: 0.000014 - momentum: 0.000000
2024-03-26 10:12:14,009 epoch 1 - iter 36/95 - loss 2.56832707 - time (sec): 6.46 - samples/sec: 1959.16 - lr: 0.000018 - momentum: 0.000000
2024-03-26 10:12:15,906 epoch 1 - iter 45/95 - loss 2.38441941 - time (sec): 8.35 - samples/sec: 1941.76 - lr: 0.000023 - momentum: 0.000000
2024-03-26 10:12:17,254 epoch 1 - iter 54/95 - loss 2.23503114 - time (sec): 9.70 - samples/sec: 1967.22 - lr: 0.000028 - momentum: 0.000000
2024-03-26 10:12:18,501 epoch 1 - iter 63/95 - loss 2.10127196 - time (sec): 10.95 - samples/sec: 1994.22 - lr: 0.000033 - momentum: 0.000000
2024-03-26 10:12:20,434 epoch 1 - iter 72/95 - loss 1.93262394 - time (sec): 12.88 - samples/sec: 1980.87 - lr: 0.000037 - momentum: 0.000000
2024-03-26 10:12:22,397 epoch 1 - iter 81/95 - loss 1.77367380 - time (sec): 14.85 - samples/sec: 1966.51 - lr: 0.000042 - momentum: 0.000000
2024-03-26 10:12:23,920 epoch 1 - iter 90/95 - loss 1.66263281 - time (sec): 16.37 - samples/sec: 1984.51 - lr: 0.000047 - momentum: 0.000000
2024-03-26 10:12:24,965 ----------------------------------------------------------------------------------------------------
2024-03-26 10:12:24,965 EPOCH 1 done: loss 1.5923 - lr: 0.000047
2024-03-26 10:12:25,850 DEV : loss 0.4564521014690399 - f1-score (micro avg)  0.6914
2024-03-26 10:12:25,851 saving best model
2024-03-26 10:12:26,114 ----------------------------------------------------------------------------------------------------
2024-03-26 10:12:27,479 epoch 2 - iter 9/95 - loss 0.52426643 - time (sec): 1.36 - samples/sec: 2007.72 - lr: 0.000050 - momentum: 0.000000
2024-03-26 10:12:29,308 epoch 2 - iter 18/95 - loss 0.41660600 - time (sec): 3.19 - samples/sec: 1913.27 - lr: 0.000049 - momentum: 0.000000
2024-03-26 10:12:30,481 epoch 2 - iter 27/95 - loss 0.40217596 - time (sec): 4.37 - samples/sec: 1965.47 - lr: 0.000048 - momentum: 0.000000
2024-03-26 10:12:32,723 epoch 2 - iter 36/95 - loss 0.38722517 - time (sec): 6.61 - samples/sec: 1918.56 - lr: 0.000048 - momentum: 0.000000
2024-03-26 10:12:34,656 epoch 2 - iter 45/95 - loss 0.37676757 - time (sec): 8.54 - samples/sec: 1925.24 - lr: 0.000047 - momentum: 0.000000
2024-03-26 10:12:36,809 epoch 2 - iter 54/95 - loss 0.36567606 - time (sec): 10.69 - samples/sec: 1897.06 - lr: 0.000047 - momentum: 0.000000
2024-03-26 10:12:38,805 epoch 2 - iter 63/95 - loss 0.35177076 - time (sec): 12.69 - samples/sec: 1852.39 - lr: 0.000046 - momentum: 0.000000
2024-03-26 10:12:40,311 epoch 2 - iter 72/95 - loss 0.35231443 - time (sec): 14.20 - samples/sec: 1860.87 - lr: 0.000046 - momentum: 0.000000
2024-03-26 10:12:41,751 epoch 2 - iter 81/95 - loss 0.35619563 - time (sec): 15.64 - samples/sec: 1883.95 - lr: 0.000045 - momentum: 0.000000
2024-03-26 10:12:43,960 epoch 2 - iter 90/95 - loss 0.34268426 - time (sec): 17.85 - samples/sec: 1858.76 - lr: 0.000045 - momentum: 0.000000
2024-03-26 10:12:44,600 ----------------------------------------------------------------------------------------------------
2024-03-26 10:12:44,600 EPOCH 2 done: loss 0.3389 - lr: 0.000045
2024-03-26 10:12:45,487 DEV : loss 0.2517702877521515 - f1-score (micro avg)  0.8598
2024-03-26 10:12:45,488 saving best model
2024-03-26 10:12:45,916 ----------------------------------------------------------------------------------------------------
2024-03-26 10:12:47,544 epoch 3 - iter 9/95 - loss 0.18626350 - time (sec): 1.63 - samples/sec: 1836.16 - lr: 0.000044 - momentum: 0.000000
2024-03-26 10:12:49,324 epoch 3 - iter 18/95 - loss 0.16963537 - time (sec): 3.41 - samples/sec: 1858.50 - lr: 0.000043 - momentum: 0.000000
2024-03-26 10:12:50,515 epoch 3 - iter 27/95 - loss 0.18166954 - time (sec): 4.60 - samples/sec: 2032.19 - lr: 0.000043 - momentum: 0.000000
2024-03-26 10:12:52,065 epoch 3 - iter 36/95 - loss 0.17639533 - time (sec): 6.15 - samples/sec: 2020.76 - lr: 0.000042 - momentum: 0.000000
2024-03-26 10:12:53,470 epoch 3 - iter 45/95 - loss 0.17991451 - time (sec): 7.55 - samples/sec: 2029.39 - lr: 0.000042 - momentum: 0.000000
2024-03-26 10:12:55,457 epoch 3 - iter 54/95 - loss 0.17836259 - time (sec): 9.54 - samples/sec: 1981.19 - lr: 0.000041 - momentum: 0.000000
2024-03-26 10:12:57,451 epoch 3 - iter 63/95 - loss 0.17801210 - time (sec): 11.53 - samples/sec: 1930.20 - lr: 0.000041 - momentum: 0.000000
2024-03-26 10:12:59,290 epoch 3 - iter 72/95 - loss 0.17908359 - time (sec): 13.37 - samples/sec: 1912.74 - lr: 0.000040 - momentum: 0.000000
2024-03-26 10:13:01,293 epoch 3 - iter 81/95 - loss 0.17473687 - time (sec): 15.38 - samples/sec: 1885.59 - lr: 0.000040 - momentum: 0.000000
2024-03-26 10:13:03,243 epoch 3 - iter 90/95 - loss 0.18331440 - time (sec): 17.33 - samples/sec: 1887.20 - lr: 0.000039 - momentum: 0.000000
2024-03-26 10:13:04,335 ----------------------------------------------------------------------------------------------------
2024-03-26 10:13:04,336 EPOCH 3 done: loss 0.1780 - lr: 0.000039
2024-03-26 10:13:05,235 DEV : loss 0.21343179047107697 - f1-score (micro avg)  0.8682
2024-03-26 10:13:05,236 saving best model
2024-03-26 10:13:05,669 ----------------------------------------------------------------------------------------------------
2024-03-26 10:13:06,950 epoch 4 - iter 9/95 - loss 0.12934404 - time (sec): 1.28 - samples/sec: 2171.15 - lr: 0.000039 - momentum: 0.000000
2024-03-26 10:13:08,782 epoch 4 - iter 18/95 - loss 0.11598588 - time (sec): 3.11 - samples/sec: 1975.63 - lr: 0.000038 - momentum: 0.000000
2024-03-26 10:13:10,695 epoch 4 - iter 27/95 - loss 0.11184561 - time (sec): 5.02 - samples/sec: 1922.31 - lr: 0.000037 - momentum: 0.000000
2024-03-26 10:13:12,169 epoch 4 - iter 36/95 - loss 0.10779297 - time (sec): 6.50 - samples/sec: 1930.71 - lr: 0.000037 - momentum: 0.000000
2024-03-26 10:13:14,583 epoch 4 - iter 45/95 - loss 0.10822640 - time (sec): 8.91 - samples/sec: 1854.16 - lr: 0.000036 - momentum: 0.000000
2024-03-26 10:13:16,423 epoch 4 - iter 54/95 - loss 0.10627905 - time (sec): 10.75 - samples/sec: 1835.61 - lr: 0.000036 - momentum: 0.000000
2024-03-26 10:13:18,347 epoch 4 - iter 63/95 - loss 0.10563130 - time (sec): 12.68 - samples/sec: 1814.36 - lr: 0.000035 - momentum: 0.000000
2024-03-26 10:13:20,216 epoch 4 - iter 72/95 - loss 0.11061803 - time (sec): 14.54 - samples/sec: 1829.28 - lr: 0.000035 - momentum: 0.000000
2024-03-26 10:13:22,230 epoch 4 - iter 81/95 - loss 0.11629637 - time (sec): 16.56 - samples/sec: 1825.85 - lr: 0.000034 - momentum: 0.000000
2024-03-26 10:13:23,199 epoch 4 - iter 90/95 - loss 0.11624132 - time (sec): 17.53 - samples/sec: 1865.91 - lr: 0.000034 - momentum: 0.000000
2024-03-26 10:13:24,226 ----------------------------------------------------------------------------------------------------
2024-03-26 10:13:24,226 EPOCH 4 done: loss 0.1162 - lr: 0.000034
2024-03-26 10:13:25,126 DEV : loss 0.17539489269256592 - f1-score (micro avg)  0.9069
2024-03-26 10:13:25,128 saving best model
2024-03-26 10:13:25,559 ----------------------------------------------------------------------------------------------------
2024-03-26 10:13:27,443 epoch 5 - iter 9/95 - loss 0.07744264 - time (sec): 1.88 - samples/sec: 1825.68 - lr: 0.000033 - momentum: 0.000000
2024-03-26 10:13:28,876 epoch 5 - iter 18/95 - loss 0.07305274 - time (sec): 3.32 - samples/sec: 1887.57 - lr: 0.000032 - momentum: 0.000000
2024-03-26 10:13:30,235 epoch 5 - iter 27/95 - loss 0.08372766 - time (sec): 4.67 - samples/sec: 1931.87 - lr: 0.000032 - momentum: 0.000000
2024-03-26 10:13:32,114 epoch 5 - iter 36/95 - loss 0.08762107 - time (sec): 6.55 - samples/sec: 1872.88 - lr: 0.000031 - momentum: 0.000000
2024-03-26 10:13:34,321 epoch 5 - iter 45/95 - loss 0.08538004 - time (sec): 8.76 - samples/sec: 1858.35 - lr: 0.000031 - momentum: 0.000000
2024-03-26 10:13:36,754 epoch 5 - iter 54/95 - loss 0.08449088 - time (sec): 11.19 - samples/sec: 1814.50 - lr: 0.000030 - momentum: 0.000000
2024-03-26 10:13:38,420 epoch 5 - iter 63/95 - loss 0.08194794 - time (sec): 12.86 - samples/sec: 1806.64 - lr: 0.000030 - momentum: 0.000000
2024-03-26 10:13:40,188 epoch 5 - iter 72/95 - loss 0.08370354 - time (sec): 14.63 - samples/sec: 1808.03 - lr: 0.000029 - momentum: 0.000000
2024-03-26 10:13:42,386 epoch 5 - iter 81/95 - loss 0.08624660 - time (sec): 16.83 - samples/sec: 1793.37 - lr: 0.000029 - momentum: 0.000000
2024-03-26 10:13:43,778 epoch 5 - iter 90/95 - loss 0.08725992 - time (sec): 18.22 - samples/sec: 1809.23 - lr: 0.000028 - momentum: 0.000000
2024-03-26 10:13:44,549 ----------------------------------------------------------------------------------------------------
2024-03-26 10:13:44,549 EPOCH 5 done: loss 0.0853 - lr: 0.000028
2024-03-26 10:13:45,528 DEV : loss 0.15603235363960266 - f1-score (micro avg)  0.9191
2024-03-26 10:13:45,529 saving best model
2024-03-26 10:13:45,955 ----------------------------------------------------------------------------------------------------
2024-03-26 10:13:47,889 epoch 6 - iter 9/95 - loss 0.05189760 - time (sec): 1.93 - samples/sec: 1805.09 - lr: 0.000027 - momentum: 0.000000
2024-03-26 10:13:49,440 epoch 6 - iter 18/95 - loss 0.05290575 - time (sec): 3.48 - samples/sec: 1826.05 - lr: 0.000027 - momentum: 0.000000
2024-03-26 10:13:51,348 epoch 6 - iter 27/95 - loss 0.05365903 - time (sec): 5.39 - samples/sec: 1834.17 - lr: 0.000026 - momentum: 0.000000
2024-03-26 10:13:52,910 epoch 6 - iter 36/95 - loss 0.05604796 - time (sec): 6.95 - samples/sec: 1834.10 - lr: 0.000026 - momentum: 0.000000
2024-03-26 10:13:54,366 epoch 6 - iter 45/95 - loss 0.05515355 - time (sec): 8.41 - samples/sec: 1870.09 - lr: 0.000025 - momentum: 0.000000
2024-03-26 10:13:55,813 epoch 6 - iter 54/95 - loss 0.05376542 - time (sec): 9.86 - samples/sec: 1868.77 - lr: 0.000025 - momentum: 0.000000
2024-03-26 10:13:57,095 epoch 6 - iter 63/95 - loss 0.05273952 - time (sec): 11.14 - samples/sec: 1930.27 - lr: 0.000024 - momentum: 0.000000
2024-03-26 10:13:59,339 epoch 6 - iter 72/95 - loss 0.06092265 - time (sec): 13.38 - samples/sec: 1897.59 - lr: 0.000024 - momentum: 0.000000
2024-03-26 10:14:00,927 epoch 6 - iter 81/95 - loss 0.05912352 - time (sec): 14.97 - samples/sec: 1914.17 - lr: 0.000023 - momentum: 0.000000
2024-03-26 10:14:02,632 epoch 6 - iter 90/95 - loss 0.06035882 - time (sec): 16.68 - samples/sec: 1932.85 - lr: 0.000023 - momentum: 0.000000
2024-03-26 10:14:03,897 ----------------------------------------------------------------------------------------------------
2024-03-26 10:14:03,897 EPOCH 6 done: loss 0.0603 - lr: 0.000023
2024-03-26 10:14:04,793 DEV : loss 0.16718925535678864 - f1-score (micro avg)  0.9201
2024-03-26 10:14:04,794 saving best model
2024-03-26 10:14:05,238 ----------------------------------------------------------------------------------------------------
2024-03-26 10:14:07,130 epoch 7 - iter 9/95 - loss 0.05321789 - time (sec): 1.89 - samples/sec: 1679.94 - lr: 0.000022 - momentum: 0.000000
2024-03-26 10:14:09,176 epoch 7 - iter 18/95 - loss 0.03684477 - time (sec): 3.94 - samples/sec: 1665.45 - lr: 0.000021 - momentum: 0.000000
2024-03-26 10:14:10,709 epoch 7 - iter 27/95 - loss 0.03258736 - time (sec): 5.47 - samples/sec: 1788.71 - lr: 0.000021 - momentum: 0.000000
2024-03-26 10:14:12,654 epoch 7 - iter 36/95 - loss 0.03305511 - time (sec): 7.41 - samples/sec: 1777.21 - lr: 0.000020 - momentum: 0.000000
2024-03-26 10:14:15,035 epoch 7 - iter 45/95 - loss 0.03992535 - time (sec): 9.80 - samples/sec: 1770.36 - lr: 0.000020 - momentum: 0.000000
2024-03-26 10:14:16,548 epoch 7 - iter 54/95 - loss 0.03978942 - time (sec): 11.31 - samples/sec: 1778.83 - lr: 0.000019 - momentum: 0.000000
2024-03-26 10:14:18,735 epoch 7 - iter 63/95 - loss 0.04213624 - time (sec): 13.49 - samples/sec: 1784.59 - lr: 0.000019 - momentum: 0.000000
2024-03-26 10:14:20,525 epoch 7 - iter 72/95 - loss 0.04587297 - time (sec): 15.29 - samples/sec: 1790.89 - lr: 0.000018 - momentum: 0.000000
2024-03-26 10:14:21,947 epoch 7 - iter 81/95 - loss 0.04361343 - time (sec): 16.71 - samples/sec: 1802.97 - lr: 0.000018 - momentum: 0.000000
2024-03-26 10:14:23,915 epoch 7 - iter 90/95 - loss 0.04536795 - time (sec): 18.68 - samples/sec: 1783.65 - lr: 0.000017 - momentum: 0.000000
2024-03-26 10:14:24,399 ----------------------------------------------------------------------------------------------------
2024-03-26 10:14:24,399 EPOCH 7 done: loss 0.0462 - lr: 0.000017
2024-03-26 10:14:25,307 DEV : loss 0.16716967523097992 - f1-score (micro avg)  0.9411
2024-03-26 10:14:25,308 saving best model
2024-03-26 10:14:25,747 ----------------------------------------------------------------------------------------------------
2024-03-26 10:14:27,627 epoch 8 - iter 9/95 - loss 0.01951604 - time (sec): 1.88 - samples/sec: 1708.16 - lr: 0.000016 - momentum: 0.000000
2024-03-26 10:14:30,113 epoch 8 - iter 18/95 - loss 0.01840664 - time (sec): 4.36 - samples/sec: 1697.69 - lr: 0.000016 - momentum: 0.000000
2024-03-26 10:14:31,879 epoch 8 - iter 27/95 - loss 0.02281365 - time (sec): 6.13 - samples/sec: 1736.13 - lr: 0.000015 - momentum: 0.000000
2024-03-26 10:14:33,422 epoch 8 - iter 36/95 - loss 0.02354290 - time (sec): 7.67 - samples/sec: 1728.02 - lr: 0.000015 - momentum: 0.000000
2024-03-26 10:14:34,930 epoch 8 - iter 45/95 - loss 0.02180613 - time (sec): 9.18 - samples/sec: 1759.24 - lr: 0.000014 - momentum: 0.000000
2024-03-26 10:14:36,592 epoch 8 - iter 54/95 - loss 0.02226186 - time (sec): 10.84 - samples/sec: 1779.13 - lr: 0.000014 - momentum: 0.000000
2024-03-26 10:14:38,783 epoch 8 - iter 63/95 - loss 0.03092783 - time (sec): 13.03 - samples/sec: 1775.62 - lr: 0.000013 - momentum: 0.000000
2024-03-26 10:14:41,032 epoch 8 - iter 72/95 - loss 0.03257996 - time (sec): 15.28 - samples/sec: 1756.12 - lr: 0.000013 - momentum: 0.000000
2024-03-26 10:14:42,690 epoch 8 - iter 81/95 - loss 0.03612619 - time (sec): 16.94 - samples/sec: 1758.38 - lr: 0.000012 - momentum: 0.000000
2024-03-26 10:14:43,981 epoch 8 - iter 90/95 - loss 0.03615101 - time (sec): 18.23 - samples/sec: 1800.86 - lr: 0.000012 - momentum: 0.000000
2024-03-26 10:14:44,878 ----------------------------------------------------------------------------------------------------
2024-03-26 10:14:44,878 EPOCH 8 done: loss 0.0350 - lr: 0.000012
2024-03-26 10:14:45,778 DEV : loss 0.1608782857656479 - f1-score (micro avg)  0.9517
2024-03-26 10:14:45,779 saving best model
2024-03-26 10:14:46,223 ----------------------------------------------------------------------------------------------------
2024-03-26 10:14:48,197 epoch 9 - iter 9/95 - loss 0.01453760 - time (sec): 1.97 - samples/sec: 1787.94 - lr: 0.000011 - momentum: 0.000000
2024-03-26 10:14:49,915 epoch 9 - iter 18/95 - loss 0.02446034 - time (sec): 3.69 - samples/sec: 1813.90 - lr: 0.000010 - momentum: 0.000000
2024-03-26 10:14:51,796 epoch 9 - iter 27/95 - loss 0.02473284 - time (sec): 5.57 - samples/sec: 1834.03 - lr: 0.000010 - momentum: 0.000000
2024-03-26 10:14:53,651 epoch 9 - iter 36/95 - loss 0.02318813 - time (sec): 7.43 - samples/sec: 1828.97 - lr: 0.000009 - momentum: 0.000000
2024-03-26 10:14:55,908 epoch 9 - iter 45/95 - loss 0.02045251 - time (sec): 9.68 - samples/sec: 1750.46 - lr: 0.000009 - momentum: 0.000000
2024-03-26 10:14:57,835 epoch 9 - iter 54/95 - loss 0.02611207 - time (sec): 11.61 - samples/sec: 1738.74 - lr: 0.000008 - momentum: 0.000000
2024-03-26 10:14:59,723 epoch 9 - iter 63/95 - loss 0.02537917 - time (sec): 13.50 - samples/sec: 1749.31 - lr: 0.000008 - momentum: 0.000000
2024-03-26 10:15:01,602 epoch 9 - iter 72/95 - loss 0.02461298 - time (sec): 15.38 - samples/sec: 1752.48 - lr: 0.000007 - momentum: 0.000000
2024-03-26 10:15:02,853 epoch 9 - iter 81/95 - loss 0.02473164 - time (sec): 16.63 - samples/sec: 1775.52 - lr: 0.000007 - momentum: 0.000000
2024-03-26 10:15:04,257 epoch 9 - iter 90/95 - loss 0.02719568 - time (sec): 18.03 - samples/sec: 1797.34 - lr: 0.000006 - momentum: 0.000000
2024-03-26 10:15:05,204 ----------------------------------------------------------------------------------------------------
2024-03-26 10:15:05,204 EPOCH 9 done: loss 0.0265 - lr: 0.000006
2024-03-26 10:15:06,105 DEV : loss 0.18035191297531128 - f1-score (micro avg)  0.9468
2024-03-26 10:15:06,106 ----------------------------------------------------------------------------------------------------
2024-03-26 10:15:08,246 epoch 10 - iter 9/95 - loss 0.00557531 - time (sec): 2.14 - samples/sec: 1781.65 - lr: 0.000005 - momentum: 0.000000
2024-03-26 10:15:09,506 epoch 10 - iter 18/95 - loss 0.00880795 - time (sec): 3.40 - samples/sec: 1904.54 - lr: 0.000005 - momentum: 0.000000
2024-03-26 10:15:10,813 epoch 10 - iter 27/95 - loss 0.02526886 - time (sec): 4.71 - samples/sec: 2014.92 - lr: 0.000004 - momentum: 0.000000
2024-03-26 10:15:12,166 epoch 10 - iter 36/95 - loss 0.02295782 - time (sec): 6.06 - samples/sec: 2035.49 - lr: 0.000004 - momentum: 0.000000
2024-03-26 10:15:14,123 epoch 10 - iter 45/95 - loss 0.01953834 - time (sec): 8.02 - samples/sec: 1986.50 - lr: 0.000003 - momentum: 0.000000
2024-03-26 10:15:15,695 epoch 10 - iter 54/95 - loss 0.01921737 - time (sec): 9.59 - samples/sec: 1977.11 - lr: 0.000003 - momentum: 0.000000
2024-03-26 10:15:18,215 epoch 10 - iter 63/95 - loss 0.02060934 - time (sec): 12.11 - samples/sec: 1898.74 - lr: 0.000002 - momentum: 0.000000
2024-03-26 10:15:19,503 epoch 10 - iter 72/95 - loss 0.01949359 - time (sec): 13.40 - samples/sec: 1903.48 - lr: 0.000002 - momentum: 0.000000
2024-03-26 10:15:21,840 epoch 10 - iter 81/95 - loss 0.01845566 - time (sec): 15.73 - samples/sec: 1851.36 - lr: 0.000001 - momentum: 0.000000
2024-03-26 10:15:24,060 epoch 10 - iter 90/95 - loss 0.02088092 - time (sec): 17.95 - samples/sec: 1831.79 - lr: 0.000001 - momentum: 0.000000
2024-03-26 10:15:25,122 ----------------------------------------------------------------------------------------------------
2024-03-26 10:15:25,122 EPOCH 10 done: loss 0.0206 - lr: 0.000001
2024-03-26 10:15:26,019 DEV : loss 0.18286916613578796 - f1-score (micro avg)  0.9417
2024-03-26 10:15:26,304 ----------------------------------------------------------------------------------------------------
2024-03-26 10:15:26,304 Loading model from best epoch ...
2024-03-26 10:15:27,155 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
2024-03-26 10:15:27,901 
Results:
- F-score (micro) 0.9163
- F-score (macro) 0.6959
- Accuracy 0.8504

By class:
              precision    recall  f1-score   support

 Unternehmen     0.9173    0.8759    0.8962       266
 Auslagerung     0.8851    0.9277    0.9059       249
         Ort     0.9708    0.9925    0.9815       134
    Software     0.0000    0.0000    0.0000         0

   micro avg     0.9128    0.9199    0.9163       649
   macro avg     0.6933    0.6990    0.6959       649
weighted avg     0.9160    0.9199    0.9175       649

2024-03-26 10:15:27,901 ----------------------------------------------------------------------------------------------------