2024-03-26 11:41:56,119 ----------------------------------------------------------------------------------------------------
2024-03-26 11:41:56,119 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(30001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2024-03-26 11:41:56,119 ----------------------------------------------------------------------------------------------------
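(Note: a minimal sketch of how a tagger with the architecture printed above might be assembled in Flair. The checkpoint name and label type are assumptions, not read from this log; the tag dictionary is rebuilt from the 17 tags listed at the end of the log.)

    from flair.data import Dictionary
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger

    # Tag dictionary rebuilt from the 17 tags reported near the end of this log.
    tag_dictionary = Dictionary(add_unk=False)
    for tag in [
        "O",
        "S-Unternehmen", "B-Unternehmen", "E-Unternehmen", "I-Unternehmen",
        "S-Auslagerung", "B-Auslagerung", "E-Auslagerung", "I-Auslagerung",
        "S-Ort", "B-Ort", "E-Ort", "I-Ort",
        "S-Software", "B-Software", "E-Software", "I-Software",
    ]:
        tag_dictionary.add_item(tag)

    # Placeholder checkpoint: any German BERT-base model with a 768-dim hidden
    # size would produce the module structure printed above.
    embeddings = TransformerWordEmbeddings(
        model="bert-base-german-cased",  # assumption, not taken from this log
        fine_tune=True,
    )

    tagger = SequenceTagger(
        embeddings=embeddings,
        tag_dictionary=tag_dictionary,
        tag_type="ner",              # assumed label type
        use_rnn=False,               # plain Linear(768, 17) head, as printed above
        use_crf=False,               # matches loss_function: CrossEntropyLoss
        reproject_embeddings=False,
    )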
2024-03-26 11:41:56,119 Corpus: 758 train + 94 dev + 96 test sentences
2024-03-26 11:41:56,119 ----------------------------------------------------------------------------------------------------
2024-03-26 11:41:56,119 Train:  758 sentences
2024-03-26 11:41:56,119         (train_with_dev=False, train_with_test=False)
2024-03-26 11:41:56,119 ----------------------------------------------------------------------------------------------------
2024-03-26 11:41:56,119 Training Params:
2024-03-26 11:41:56,119  - learning_rate: "5e-05" 
2024-03-26 11:41:56,119  - mini_batch_size: "8"
2024-03-26 11:41:56,119  - max_epochs: "10"
2024-03-26 11:41:56,119  - shuffle: "True"
2024-03-26 11:41:56,119 ----------------------------------------------------------------------------------------------------
2024-03-26 11:41:56,119 Plugins:
2024-03-26 11:41:56,119  - TensorboardLogger
2024-03-26 11:41:56,119  - LinearScheduler | warmup_fraction: '0.1'
2024-03-26 11:41:56,119 ----------------------------------------------------------------------------------------------------
2024-03-26 11:41:56,119 Final evaluation on model from best epoch (best-model.pt)
2024-03-26 11:41:56,119  - metric: "('micro avg', 'f1-score')"
2024-03-26 11:41:56,119 ----------------------------------------------------------------------------------------------------
2024-03-26 11:41:56,119 Computation:
2024-03-26 11:41:56,119  - compute on device: cuda:0
2024-03-26 11:41:56,119  - embedding storage: none
2024-03-26 11:41:56,119 ----------------------------------------------------------------------------------------------------
2024-03-26 11:41:56,120 Model training base path: "flair-co-funer-german_bert_base-bs8-e10-lr5e-05-3"
2024-03-26 11:41:56,120 ----------------------------------------------------------------------------------------------------
2024-03-26 11:41:56,120 ----------------------------------------------------------------------------------------------------
2024-03-26 11:41:56,120 Logging anything other than scalars to TensorBoard is currently not supported.
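(Note: the training parameters above correspond roughly to a ModelTrainer.fine_tune call along these lines, assuming the tagger sketched earlier. fine_tune applies a linear learning-rate schedule with warm-up by default, which is what the LinearScheduler plugin entry reflects. The corpus loading is a placeholder; the data folder and column layout are illustrative, not read from this log.)

    from flair.datasets import ColumnCorpus
    from flair.trainers import ModelTrainer

    # Placeholder corpus: CoNLL-style column files are assumed here.
    corpus = ColumnCorpus(
        "data/co-funer",             # illustrative path
        {0: "text", 1: "ner"},       # illustrative column layout
        train_file="train.txt",
        dev_file="dev.txt",
        test_file="test.txt",
    )

    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "flair-co-funer-german_bert_base-bs8-e10-lr5e-05-3",  # base path from this log
        learning_rate=5e-05,
        mini_batch_size=8,
        max_epochs=10,
    )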
2024-03-26 11:41:57,551 epoch 1 - iter 9/95 - loss 2.98531268 - time (sec): 1.43 - samples/sec: 2229.09 - lr: 0.000004 - momentum: 0.000000
2024-03-26 11:41:59,510 epoch 1 - iter 18/95 - loss 2.85818160 - time (sec): 3.39 - samples/sec: 1862.53 - lr: 0.000009 - momentum: 0.000000
2024-03-26 11:42:01,481 epoch 1 - iter 27/95 - loss 2.62573431 - time (sec): 5.36 - samples/sec: 1842.70 - lr: 0.000014 - momentum: 0.000000
2024-03-26 11:42:02,897 epoch 1 - iter 36/95 - loss 2.42801287 - time (sec): 6.78 - samples/sec: 1866.86 - lr: 0.000018 - momentum: 0.000000
2024-03-26 11:42:04,861 epoch 1 - iter 45/95 - loss 2.26046296 - time (sec): 8.74 - samples/sec: 1855.81 - lr: 0.000023 - momentum: 0.000000
2024-03-26 11:42:06,284 epoch 1 - iter 54/95 - loss 2.12403468 - time (sec): 10.16 - samples/sec: 1877.80 - lr: 0.000028 - momentum: 0.000000
2024-03-26 11:42:07,571 epoch 1 - iter 63/95 - loss 2.00022038 - time (sec): 11.45 - samples/sec: 1906.91 - lr: 0.000033 - momentum: 0.000000
2024-03-26 11:42:09,548 epoch 1 - iter 72/95 - loss 1.83421116 - time (sec): 13.43 - samples/sec: 1900.47 - lr: 0.000037 - momentum: 0.000000
2024-03-26 11:42:11,580 epoch 1 - iter 81/95 - loss 1.68618442 - time (sec): 15.46 - samples/sec: 1888.38 - lr: 0.000042 - momentum: 0.000000
2024-03-26 11:42:13,120 epoch 1 - iter 90/95 - loss 1.57871779 - time (sec): 17.00 - samples/sec: 1910.76 - lr: 0.000047 - momentum: 0.000000
2024-03-26 11:42:14,216 ----------------------------------------------------------------------------------------------------
2024-03-26 11:42:14,217 EPOCH 1 done: loss 1.5104 - lr: 0.000047
2024-03-26 11:42:15,178 DEV : loss 0.45259279012680054 - f1-score (micro avg)  0.7077
2024-03-26 11:42:15,181 saving best model
2024-03-26 11:42:15,475 ----------------------------------------------------------------------------------------------------
2024-03-26 11:42:16,898 epoch 2 - iter 9/95 - loss 0.44373371 - time (sec): 1.42 - samples/sec: 1926.19 - lr: 0.000050 - momentum: 0.000000
2024-03-26 11:42:18,795 epoch 2 - iter 18/95 - loss 0.37916467 - time (sec): 3.32 - samples/sec: 1840.39 - lr: 0.000049 - momentum: 0.000000
2024-03-26 11:42:20,007 epoch 2 - iter 27/95 - loss 0.38336252 - time (sec): 4.53 - samples/sec: 1893.64 - lr: 0.000048 - momentum: 0.000000
2024-03-26 11:42:22,319 epoch 2 - iter 36/95 - loss 0.35569118 - time (sec): 6.84 - samples/sec: 1852.74 - lr: 0.000048 - momentum: 0.000000
2024-03-26 11:42:24,292 epoch 2 - iter 45/95 - loss 0.34697390 - time (sec): 8.82 - samples/sec: 1865.19 - lr: 0.000047 - momentum: 0.000000
2024-03-26 11:42:26,552 epoch 2 - iter 54/95 - loss 0.33937698 - time (sec): 11.08 - samples/sec: 1831.59 - lr: 0.000047 - momentum: 0.000000
2024-03-26 11:42:28,617 epoch 2 - iter 63/95 - loss 0.32583403 - time (sec): 13.14 - samples/sec: 1788.68 - lr: 0.000046 - momentum: 0.000000
2024-03-26 11:42:30,187 epoch 2 - iter 72/95 - loss 0.32710942 - time (sec): 14.71 - samples/sec: 1795.62 - lr: 0.000046 - momentum: 0.000000
2024-03-26 11:42:31,664 epoch 2 - iter 81/95 - loss 0.33105206 - time (sec): 16.19 - samples/sec: 1819.71 - lr: 0.000045 - momentum: 0.000000
2024-03-26 11:42:34,031 epoch 2 - iter 90/95 - loss 0.31864283 - time (sec): 18.56 - samples/sec: 1787.63 - lr: 0.000045 - momentum: 0.000000
2024-03-26 11:42:34,678 ----------------------------------------------------------------------------------------------------
2024-03-26 11:42:34,678 EPOCH 2 done: loss 0.3160 - lr: 0.000045
2024-03-26 11:42:35,616 DEV : loss 0.25337713956832886 - f1-score (micro avg)  0.8517
2024-03-26 11:42:35,618 saving best model
2024-03-26 11:42:36,073 ----------------------------------------------------------------------------------------------------
2024-03-26 11:42:37,752 epoch 3 - iter 9/95 - loss 0.16812656 - time (sec): 1.68 - samples/sec: 1780.13 - lr: 0.000044 - momentum: 0.000000
2024-03-26 11:42:39,594 epoch 3 - iter 18/95 - loss 0.15920408 - time (sec): 3.52 - samples/sec: 1798.52 - lr: 0.000043 - momentum: 0.000000
2024-03-26 11:42:40,825 epoch 3 - iter 27/95 - loss 0.16872134 - time (sec): 4.75 - samples/sec: 1966.68 - lr: 0.000043 - momentum: 0.000000
2024-03-26 11:42:42,410 epoch 3 - iter 36/95 - loss 0.16893319 - time (sec): 6.34 - samples/sec: 1960.85 - lr: 0.000042 - momentum: 0.000000
2024-03-26 11:42:43,896 epoch 3 - iter 45/95 - loss 0.17276018 - time (sec): 7.82 - samples/sec: 1959.75 - lr: 0.000042 - momentum: 0.000000
2024-03-26 11:42:45,951 epoch 3 - iter 54/95 - loss 0.16842614 - time (sec): 9.88 - samples/sec: 1913.47 - lr: 0.000041 - momentum: 0.000000
2024-03-26 11:42:48,002 epoch 3 - iter 63/95 - loss 0.16467077 - time (sec): 11.93 - samples/sec: 1866.41 - lr: 0.000041 - momentum: 0.000000
2024-03-26 11:42:49,892 epoch 3 - iter 72/95 - loss 0.16937138 - time (sec): 13.82 - samples/sec: 1851.04 - lr: 0.000040 - momentum: 0.000000
2024-03-26 11:42:51,947 epoch 3 - iter 81/95 - loss 0.16182392 - time (sec): 15.87 - samples/sec: 1826.48 - lr: 0.000040 - momentum: 0.000000
2024-03-26 11:42:53,971 epoch 3 - iter 90/95 - loss 0.16878367 - time (sec): 17.90 - samples/sec: 1826.88 - lr: 0.000039 - momentum: 0.000000
2024-03-26 11:42:55,109 ----------------------------------------------------------------------------------------------------
2024-03-26 11:42:55,109 EPOCH 3 done: loss 0.1651 - lr: 0.000039
2024-03-26 11:42:56,049 DEV : loss 0.20585086941719055 - f1-score (micro avg)  0.8694
2024-03-26 11:42:56,051 saving best model
2024-03-26 11:42:56,491 ----------------------------------------------------------------------------------------------------
2024-03-26 11:42:57,810 epoch 4 - iter 9/95 - loss 0.13194118 - time (sec): 1.32 - samples/sec: 2106.20 - lr: 0.000039 - momentum: 0.000000
2024-03-26 11:42:59,733 epoch 4 - iter 18/95 - loss 0.11635540 - time (sec): 3.24 - samples/sec: 1896.31 - lr: 0.000038 - momentum: 0.000000
2024-03-26 11:43:01,806 epoch 4 - iter 27/95 - loss 0.11616374 - time (sec): 5.31 - samples/sec: 1817.17 - lr: 0.000037 - momentum: 0.000000
2024-03-26 11:43:03,352 epoch 4 - iter 36/95 - loss 0.11100008 - time (sec): 6.86 - samples/sec: 1828.51 - lr: 0.000037 - momentum: 0.000000
2024-03-26 11:43:05,888 epoch 4 - iter 45/95 - loss 0.10535501 - time (sec): 9.40 - samples/sec: 1758.68 - lr: 0.000036 - momentum: 0.000000
2024-03-26 11:43:07,797 epoch 4 - iter 54/95 - loss 0.10029654 - time (sec): 11.30 - samples/sec: 1745.70 - lr: 0.000036 - momentum: 0.000000
2024-03-26 11:43:09,822 epoch 4 - iter 63/95 - loss 0.09898514 - time (sec): 13.33 - samples/sec: 1725.30 - lr: 0.000035 - momentum: 0.000000
2024-03-26 11:43:11,758 epoch 4 - iter 72/95 - loss 0.10453018 - time (sec): 15.27 - samples/sec: 1742.83 - lr: 0.000035 - momentum: 0.000000
2024-03-26 11:43:13,833 epoch 4 - iter 81/95 - loss 0.11020391 - time (sec): 17.34 - samples/sec: 1743.51 - lr: 0.000034 - momentum: 0.000000
2024-03-26 11:43:14,852 epoch 4 - iter 90/95 - loss 0.10933303 - time (sec): 18.36 - samples/sec: 1781.31 - lr: 0.000034 - momentum: 0.000000
2024-03-26 11:43:15,901 ----------------------------------------------------------------------------------------------------
2024-03-26 11:43:15,901 EPOCH 4 done: loss 0.1085 - lr: 0.000034
2024-03-26 11:43:16,845 DEV : loss 0.21658340096473694 - f1-score (micro avg)  0.8922
2024-03-26 11:43:16,846 saving best model
2024-03-26 11:43:17,284 ----------------------------------------------------------------------------------------------------
2024-03-26 11:43:19,178 epoch 5 - iter 9/95 - loss 0.08765859 - time (sec): 1.89 - samples/sec: 1816.78 - lr: 0.000033 - momentum: 0.000000
2024-03-26 11:43:20,627 epoch 5 - iter 18/95 - loss 0.08461158 - time (sec): 3.34 - samples/sec: 1872.82 - lr: 0.000032 - momentum: 0.000000
2024-03-26 11:43:22,000 epoch 5 - iter 27/95 - loss 0.08758849 - time (sec): 4.71 - samples/sec: 1915.74 - lr: 0.000032 - momentum: 0.000000
2024-03-26 11:43:23,943 epoch 5 - iter 36/95 - loss 0.09087793 - time (sec): 6.66 - samples/sec: 1843.47 - lr: 0.000031 - momentum: 0.000000
2024-03-26 11:43:26,240 epoch 5 - iter 45/95 - loss 0.08947032 - time (sec): 8.95 - samples/sec: 1818.21 - lr: 0.000031 - momentum: 0.000000
2024-03-26 11:43:28,756 epoch 5 - iter 54/95 - loss 0.08289016 - time (sec): 11.47 - samples/sec: 1770.76 - lr: 0.000030 - momentum: 0.000000
2024-03-26 11:43:30,466 epoch 5 - iter 63/95 - loss 0.08016941 - time (sec): 13.18 - samples/sec: 1762.63 - lr: 0.000030 - momentum: 0.000000
2024-03-26 11:43:32,299 epoch 5 - iter 72/95 - loss 0.07857856 - time (sec): 15.01 - samples/sec: 1761.48 - lr: 0.000029 - momentum: 0.000000
2024-03-26 11:43:34,629 epoch 5 - iter 81/95 - loss 0.07793911 - time (sec): 17.34 - samples/sec: 1739.81 - lr: 0.000029 - momentum: 0.000000
2024-03-26 11:43:36,053 epoch 5 - iter 90/95 - loss 0.08006927 - time (sec): 18.77 - samples/sec: 1756.16 - lr: 0.000028 - momentum: 0.000000
2024-03-26 11:43:36,853 ----------------------------------------------------------------------------------------------------
2024-03-26 11:43:36,853 EPOCH 5 done: loss 0.0783 - lr: 0.000028
2024-03-26 11:43:37,901 DEV : loss 0.2107645571231842 - f1-score (micro avg)  0.8962
2024-03-26 11:43:37,904 saving best model
2024-03-26 11:43:38,328 ----------------------------------------------------------------------------------------------------
2024-03-26 11:43:40,303 epoch 6 - iter 9/95 - loss 0.06558960 - time (sec): 1.97 - samples/sec: 1767.62 - lr: 0.000027 - momentum: 0.000000
2024-03-26 11:43:41,904 epoch 6 - iter 18/95 - loss 0.05924394 - time (sec): 3.57 - samples/sec: 1779.28 - lr: 0.000027 - momentum: 0.000000
2024-03-26 11:43:43,873 epoch 6 - iter 27/95 - loss 0.05582816 - time (sec): 5.54 - samples/sec: 1783.78 - lr: 0.000026 - momentum: 0.000000
2024-03-26 11:43:45,492 epoch 6 - iter 36/95 - loss 0.05713882 - time (sec): 7.16 - samples/sec: 1780.31 - lr: 0.000026 - momentum: 0.000000
2024-03-26 11:43:46,974 epoch 6 - iter 45/95 - loss 0.05759581 - time (sec): 8.64 - samples/sec: 1819.13 - lr: 0.000025 - momentum: 0.000000
2024-03-26 11:43:48,460 epoch 6 - iter 54/95 - loss 0.05489937 - time (sec): 10.13 - samples/sec: 1818.01 - lr: 0.000025 - momentum: 0.000000
2024-03-26 11:43:49,780 epoch 6 - iter 63/95 - loss 0.05343196 - time (sec): 11.45 - samples/sec: 1877.58 - lr: 0.000024 - momentum: 0.000000
2024-03-26 11:43:52,126 epoch 6 - iter 72/95 - loss 0.06130712 - time (sec): 13.80 - samples/sec: 1840.46 - lr: 0.000024 - momentum: 0.000000
2024-03-26 11:43:53,766 epoch 6 - iter 81/95 - loss 0.05884556 - time (sec): 15.44 - samples/sec: 1856.30 - lr: 0.000023 - momentum: 0.000000
2024-03-26 11:43:55,527 epoch 6 - iter 90/95 - loss 0.06110209 - time (sec): 17.20 - samples/sec: 1874.14 - lr: 0.000023 - momentum: 0.000000
2024-03-26 11:43:56,911 ----------------------------------------------------------------------------------------------------
2024-03-26 11:43:56,911 EPOCH 6 done: loss 0.0620 - lr: 0.000023
2024-03-26 11:43:57,864 DEV : loss 0.207626074552536 - f1-score (micro avg)  0.9134
2024-03-26 11:43:57,865 saving best model
2024-03-26 11:43:58,302 ----------------------------------------------------------------------------------------------------
2024-03-26 11:44:00,339 epoch 7 - iter 9/95 - loss 0.04872148 - time (sec): 2.04 - samples/sec: 1559.83 - lr: 0.000022 - momentum: 0.000000
2024-03-26 11:44:02,401 epoch 7 - iter 18/95 - loss 0.03440985 - time (sec): 4.10 - samples/sec: 1599.86 - lr: 0.000021 - momentum: 0.000000
2024-03-26 11:44:04,004 epoch 7 - iter 27/95 - loss 0.03042082 - time (sec): 5.70 - samples/sec: 1715.86 - lr: 0.000021 - momentum: 0.000000
2024-03-26 11:44:06,048 epoch 7 - iter 36/95 - loss 0.02930015 - time (sec): 7.75 - samples/sec: 1701.29 - lr: 0.000020 - momentum: 0.000000
2024-03-26 11:44:08,462 epoch 7 - iter 45/95 - loss 0.03113366 - time (sec): 10.16 - samples/sec: 1706.92 - lr: 0.000020 - momentum: 0.000000
2024-03-26 11:44:10,026 epoch 7 - iter 54/95 - loss 0.03136366 - time (sec): 11.72 - samples/sec: 1715.97 - lr: 0.000019 - momentum: 0.000000
2024-03-26 11:44:12,277 epoch 7 - iter 63/95 - loss 0.03476295 - time (sec): 13.97 - samples/sec: 1723.43 - lr: 0.000019 - momentum: 0.000000
2024-03-26 11:44:14,115 epoch 7 - iter 72/95 - loss 0.04176420 - time (sec): 15.81 - samples/sec: 1731.25 - lr: 0.000018 - momentum: 0.000000
2024-03-26 11:44:15,576 epoch 7 - iter 81/95 - loss 0.03898562 - time (sec): 17.27 - samples/sec: 1743.95 - lr: 0.000018 - momentum: 0.000000
2024-03-26 11:44:17,656 epoch 7 - iter 90/95 - loss 0.04260750 - time (sec): 19.35 - samples/sec: 1721.22 - lr: 0.000017 - momentum: 0.000000
2024-03-26 11:44:18,129 ----------------------------------------------------------------------------------------------------
2024-03-26 11:44:18,129 EPOCH 7 done: loss 0.0426 - lr: 0.000017
2024-03-26 11:44:19,058 DEV : loss 0.19692113995552063 - f1-score (micro avg)  0.9265
2024-03-26 11:44:19,059 saving best model
2024-03-26 11:44:19,476 ----------------------------------------------------------------------------------------------------
2024-03-26 11:44:21,361 epoch 8 - iter 9/95 - loss 0.02324695 - time (sec): 1.88 - samples/sec: 1701.92 - lr: 0.000016 - momentum: 0.000000
2024-03-26 11:44:23,922 epoch 8 - iter 18/95 - loss 0.02314308 - time (sec): 4.44 - samples/sec: 1666.58 - lr: 0.000016 - momentum: 0.000000
2024-03-26 11:44:25,761 epoch 8 - iter 27/95 - loss 0.02004589 - time (sec): 6.28 - samples/sec: 1693.42 - lr: 0.000015 - momentum: 0.000000
2024-03-26 11:44:27,364 epoch 8 - iter 36/95 - loss 0.02239681 - time (sec): 7.89 - samples/sec: 1681.07 - lr: 0.000015 - momentum: 0.000000
2024-03-26 11:44:28,916 epoch 8 - iter 45/95 - loss 0.02059954 - time (sec): 9.44 - samples/sec: 1711.02 - lr: 0.000014 - momentum: 0.000000
2024-03-26 11:44:30,633 epoch 8 - iter 54/95 - loss 0.02199162 - time (sec): 11.16 - samples/sec: 1729.07 - lr: 0.000014 - momentum: 0.000000
2024-03-26 11:44:32,899 epoch 8 - iter 63/95 - loss 0.03012961 - time (sec): 13.42 - samples/sec: 1724.26 - lr: 0.000013 - momentum: 0.000000
2024-03-26 11:44:35,247 epoch 8 - iter 72/95 - loss 0.03445933 - time (sec): 15.77 - samples/sec: 1701.85 - lr: 0.000013 - momentum: 0.000000
2024-03-26 11:44:36,980 epoch 8 - iter 81/95 - loss 0.04080364 - time (sec): 17.50 - samples/sec: 1701.89 - lr: 0.000012 - momentum: 0.000000
2024-03-26 11:44:38,290 epoch 8 - iter 90/95 - loss 0.03967245 - time (sec): 18.81 - samples/sec: 1745.22 - lr: 0.000012 - momentum: 0.000000
2024-03-26 11:44:39,220 ----------------------------------------------------------------------------------------------------
2024-03-26 11:44:39,220 EPOCH 8 done: loss 0.0382 - lr: 0.000012
2024-03-26 11:44:40,160 DEV : loss 0.20949774980545044 - f1-score (micro avg)  0.9225
2024-03-26 11:44:40,163 ----------------------------------------------------------------------------------------------------
2024-03-26 11:44:42,192 epoch 9 - iter 9/95 - loss 0.00709184 - time (sec): 2.03 - samples/sec: 1738.96 - lr: 0.000011 - momentum: 0.000000
2024-03-26 11:44:43,994 epoch 9 - iter 18/95 - loss 0.01806499 - time (sec): 3.83 - samples/sec: 1747.73 - lr: 0.000010 - momentum: 0.000000
2024-03-26 11:44:45,936 epoch 9 - iter 27/95 - loss 0.01890732 - time (sec): 5.77 - samples/sec: 1769.95 - lr: 0.000010 - momentum: 0.000000
2024-03-26 11:44:47,884 epoch 9 - iter 36/95 - loss 0.01966161 - time (sec): 7.72 - samples/sec: 1759.21 - lr: 0.000009 - momentum: 0.000000
2024-03-26 11:44:50,206 epoch 9 - iter 45/95 - loss 0.01858568 - time (sec): 10.04 - samples/sec: 1687.90 - lr: 0.000009 - momentum: 0.000000
2024-03-26 11:44:52,199 epoch 9 - iter 54/95 - loss 0.02246655 - time (sec): 12.04 - samples/sec: 1677.38 - lr: 0.000008 - momentum: 0.000000
2024-03-26 11:44:54,161 epoch 9 - iter 63/95 - loss 0.02234549 - time (sec): 14.00 - samples/sec: 1686.83 - lr: 0.000008 - momentum: 0.000000
2024-03-26 11:44:56,178 epoch 9 - iter 72/95 - loss 0.02615055 - time (sec): 16.01 - samples/sec: 1682.78 - lr: 0.000007 - momentum: 0.000000
2024-03-26 11:44:57,469 epoch 9 - iter 81/95 - loss 0.02729114 - time (sec): 17.31 - samples/sec: 1706.05 - lr: 0.000007 - momentum: 0.000000
2024-03-26 11:44:58,945 epoch 9 - iter 90/95 - loss 0.03054163 - time (sec): 18.78 - samples/sec: 1725.62 - lr: 0.000006 - momentum: 0.000000
2024-03-26 11:44:59,928 ----------------------------------------------------------------------------------------------------
2024-03-26 11:44:59,928 EPOCH 9 done: loss 0.0304 - lr: 0.000006
2024-03-26 11:45:00,895 DEV : loss 0.2351907193660736 - f1-score (micro avg)  0.9233
2024-03-26 11:45:00,897 ----------------------------------------------------------------------------------------------------
2024-03-26 11:45:03,122 epoch 10 - iter 9/95 - loss 0.01271718 - time (sec): 2.22 - samples/sec: 1714.23 - lr: 0.000005 - momentum: 0.000000
2024-03-26 11:45:04,449 epoch 10 - iter 18/95 - loss 0.01054011 - time (sec): 3.55 - samples/sec: 1823.00 - lr: 0.000005 - momentum: 0.000000
2024-03-26 11:45:05,826 epoch 10 - iter 27/95 - loss 0.02501471 - time (sec): 4.93 - samples/sec: 1924.44 - lr: 0.000004 - momentum: 0.000000
2024-03-26 11:45:07,249 epoch 10 - iter 36/95 - loss 0.02420587 - time (sec): 6.35 - samples/sec: 1942.16 - lr: 0.000004 - momentum: 0.000000
2024-03-26 11:45:09,261 epoch 10 - iter 45/95 - loss 0.01990443 - time (sec): 8.36 - samples/sec: 1904.24 - lr: 0.000003 - momentum: 0.000000
2024-03-26 11:45:10,946 epoch 10 - iter 54/95 - loss 0.01895113 - time (sec): 10.05 - samples/sec: 1886.59 - lr: 0.000003 - momentum: 0.000000
2024-03-26 11:45:13,542 epoch 10 - iter 63/95 - loss 0.02078384 - time (sec): 12.64 - samples/sec: 1818.35 - lr: 0.000002 - momentum: 0.000000
2024-03-26 11:45:14,847 epoch 10 - iter 72/95 - loss 0.01996382 - time (sec): 13.95 - samples/sec: 1828.09 - lr: 0.000002 - momentum: 0.000000
2024-03-26 11:45:17,272 epoch 10 - iter 81/95 - loss 0.01848243 - time (sec): 16.37 - samples/sec: 1779.04 - lr: 0.000001 - momentum: 0.000000
2024-03-26 11:45:19,576 epoch 10 - iter 90/95 - loss 0.02313635 - time (sec): 18.68 - samples/sec: 1760.78 - lr: 0.000001 - momentum: 0.000000
2024-03-26 11:45:20,666 ----------------------------------------------------------------------------------------------------
2024-03-26 11:45:20,666 EPOCH 10 done: loss 0.0239 - lr: 0.000001
2024-03-26 11:45:21,614 DEV : loss 0.23291419446468353 - f1-score (micro avg)  0.9301
2024-03-26 11:45:21,617 saving best model
2024-03-26 11:45:22,374 ----------------------------------------------------------------------------------------------------
2024-03-26 11:45:22,374 Loading model from best epoch ...
2024-03-26 11:45:23,249 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
2024-03-26 11:45:24,026 
Results:
- F-score (micro) 0.9121
- F-score (macro) 0.6937
- Accuracy 0.8408

By class:
              precision    recall  f1-score   support

 Unternehmen     0.9154    0.8947    0.9049       266
 Auslagerung     0.8626    0.9076    0.8845       249
         Ort     0.9779    0.9925    0.9852       134
    Software     0.0000    0.0000    0.0000         0

   micro avg     0.9045    0.9199    0.9121       649
   macro avg     0.6890    0.6987    0.6937       649
weighted avg     0.9080    0.9199    0.9137       649

2024-03-26 11:45:24,026 ----------------------------------------------------------------------------------------------------
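(Note: the best checkpoint saved during this run can be loaded for inference along these lines. This is a sketch: the example sentence is made up, and the label type is assumed to be "ner".)

    from flair.data import Sentence
    from flair.models import SequenceTagger

    # Load the checkpoint written to the base path reported in this log.
    tagger = SequenceTagger.load(
        "flair-co-funer-german_bert_base-bs8-e10-lr5e-05-3/best-model.pt"
    )

    # Illustrative input sentence (not from the training data).
    sentence = Sentence("Die Bank lagert den IT-Betrieb an die Beispiel GmbH in Frankfurt aus.")
    tagger.predict(sentence)

    # Print the predicted entity spans with their labels and scores.
    for entity in sentence.get_spans("ner"):
        print(entity)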