Commit 6a04f01 by gazquez
1 Parent(s): c0c4feb

Added logging

Files changed (2)
  1. README.md +2 -1
  2. training.log +408 -0
README.md CHANGED
@@ -4,8 +4,9 @@ language:
  license: isc
  library_name: flair
  tags:
- - token-classification
  - flair
+ - token-classification
+ - sequence-tagger-model
  metrics:
  - f1
  - precision
training.log ADDED
@@ -0,0 +1,408 @@
1
+ 2022-10-01 00:23:25,105 ----------------------------------------------------------------------------------------------------
2
+ 2022-10-01 00:23:25,107 Model: "SequenceTagger(
3
+ (embeddings): StackedEmbeddings(
4
+ (list_embedding_0): TransformerWordEmbeddings(
5
+ (model): BertModel(
6
+ (embeddings): BertEmbeddings(
7
+ (word_embeddings): Embedding(119547, 768, padding_idx=0)
8
+ (position_embeddings): Embedding(512, 768)
9
+ (token_type_embeddings): Embedding(2, 768)
10
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
11
+ (dropout): Dropout(p=0.1, inplace=False)
12
+ )
13
+ (encoder): BertEncoder(
14
+ (layer): ModuleList(
15
+ (0): BertLayer(
16
+ (attention): BertAttention(
17
+ (self): BertSelfAttention(
18
+ (query): Linear(in_features=768, out_features=768, bias=True)
19
+ (key): Linear(in_features=768, out_features=768, bias=True)
20
+ (value): Linear(in_features=768, out_features=768, bias=True)
21
+ (dropout): Dropout(p=0.1, inplace=False)
22
+ )
23
+ (output): BertSelfOutput(
24
+ (dense): Linear(in_features=768, out_features=768, bias=True)
25
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
26
+ (dropout): Dropout(p=0.1, inplace=False)
27
+ )
28
+ )
29
+ (intermediate): BertIntermediate(
30
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
31
+ (intermediate_act_fn): GELUActivation()
32
+ )
33
+ (output): BertOutput(
34
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
35
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
36
+ (dropout): Dropout(p=0.1, inplace=False)
37
+ )
38
+ )
39
+ (1): BertLayer(
40
+ (attention): BertAttention(
41
+ (self): BertSelfAttention(
42
+ (query): Linear(in_features=768, out_features=768, bias=True)
43
+ (key): Linear(in_features=768, out_features=768, bias=True)
44
+ (value): Linear(in_features=768, out_features=768, bias=True)
45
+ (dropout): Dropout(p=0.1, inplace=False)
46
+ )
47
+ (output): BertSelfOutput(
48
+ (dense): Linear(in_features=768, out_features=768, bias=True)
49
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
50
+ (dropout): Dropout(p=0.1, inplace=False)
51
+ )
52
+ )
53
+ (intermediate): BertIntermediate(
54
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
55
+ (intermediate_act_fn): GELUActivation()
56
+ )
57
+ (output): BertOutput(
58
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
59
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
60
+ (dropout): Dropout(p=0.1, inplace=False)
61
+ )
62
+ )
63
+ (2): BertLayer(
64
+ (attention): BertAttention(
65
+ (self): BertSelfAttention(
66
+ (query): Linear(in_features=768, out_features=768, bias=True)
67
+ (key): Linear(in_features=768, out_features=768, bias=True)
68
+ (value): Linear(in_features=768, out_features=768, bias=True)
69
+ (dropout): Dropout(p=0.1, inplace=False)
70
+ )
71
+ (output): BertSelfOutput(
72
+ (dense): Linear(in_features=768, out_features=768, bias=True)
73
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
74
+ (dropout): Dropout(p=0.1, inplace=False)
75
+ )
76
+ )
77
+ (intermediate): BertIntermediate(
78
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
79
+ (intermediate_act_fn): GELUActivation()
80
+ )
81
+ (output): BertOutput(
82
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
83
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
84
+ (dropout): Dropout(p=0.1, inplace=False)
85
+ )
86
+ )
87
+ (3): BertLayer(
88
+ (attention): BertAttention(
89
+ (self): BertSelfAttention(
90
+ (query): Linear(in_features=768, out_features=768, bias=True)
91
+ (key): Linear(in_features=768, out_features=768, bias=True)
92
+ (value): Linear(in_features=768, out_features=768, bias=True)
93
+ (dropout): Dropout(p=0.1, inplace=False)
94
+ )
95
+ (output): BertSelfOutput(
96
+ (dense): Linear(in_features=768, out_features=768, bias=True)
97
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
98
+ (dropout): Dropout(p=0.1, inplace=False)
99
+ )
100
+ )
101
+ (intermediate): BertIntermediate(
102
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
103
+ (intermediate_act_fn): GELUActivation()
104
+ )
105
+ (output): BertOutput(
106
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
107
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
108
+ (dropout): Dropout(p=0.1, inplace=False)
109
+ )
110
+ )
111
+ (4): BertLayer(
112
+ (attention): BertAttention(
113
+ (self): BertSelfAttention(
114
+ (query): Linear(in_features=768, out_features=768, bias=True)
115
+ (key): Linear(in_features=768, out_features=768, bias=True)
116
+ (value): Linear(in_features=768, out_features=768, bias=True)
117
+ (dropout): Dropout(p=0.1, inplace=False)
118
+ )
119
+ (output): BertSelfOutput(
120
+ (dense): Linear(in_features=768, out_features=768, bias=True)
121
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
122
+ (dropout): Dropout(p=0.1, inplace=False)
123
+ )
124
+ )
125
+ (intermediate): BertIntermediate(
126
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
127
+ (intermediate_act_fn): GELUActivation()
128
+ )
129
+ (output): BertOutput(
130
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
131
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
132
+ (dropout): Dropout(p=0.1, inplace=False)
133
+ )
134
+ )
135
+ (5): BertLayer(
136
+ (attention): BertAttention(
137
+ (self): BertSelfAttention(
138
+ (query): Linear(in_features=768, out_features=768, bias=True)
139
+ (key): Linear(in_features=768, out_features=768, bias=True)
140
+ (value): Linear(in_features=768, out_features=768, bias=True)
141
+ (dropout): Dropout(p=0.1, inplace=False)
142
+ )
143
+ (output): BertSelfOutput(
144
+ (dense): Linear(in_features=768, out_features=768, bias=True)
145
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
146
+ (dropout): Dropout(p=0.1, inplace=False)
147
+ )
148
+ )
149
+ (intermediate): BertIntermediate(
150
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
151
+ (intermediate_act_fn): GELUActivation()
152
+ )
153
+ (output): BertOutput(
154
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
155
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
156
+ (dropout): Dropout(p=0.1, inplace=False)
157
+ )
158
+ )
159
+ (6): BertLayer(
160
+ (attention): BertAttention(
161
+ (self): BertSelfAttention(
162
+ (query): Linear(in_features=768, out_features=768, bias=True)
163
+ (key): Linear(in_features=768, out_features=768, bias=True)
164
+ (value): Linear(in_features=768, out_features=768, bias=True)
165
+ (dropout): Dropout(p=0.1, inplace=False)
166
+ )
167
+ (output): BertSelfOutput(
168
+ (dense): Linear(in_features=768, out_features=768, bias=True)
169
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
170
+ (dropout): Dropout(p=0.1, inplace=False)
171
+ )
172
+ )
173
+ (intermediate): BertIntermediate(
174
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
175
+ (intermediate_act_fn): GELUActivation()
176
+ )
177
+ (output): BertOutput(
178
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
179
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
180
+ (dropout): Dropout(p=0.1, inplace=False)
181
+ )
182
+ )
183
+ (7): BertLayer(
184
+ (attention): BertAttention(
185
+ (self): BertSelfAttention(
186
+ (query): Linear(in_features=768, out_features=768, bias=True)
187
+ (key): Linear(in_features=768, out_features=768, bias=True)
188
+ (value): Linear(in_features=768, out_features=768, bias=True)
189
+ (dropout): Dropout(p=0.1, inplace=False)
190
+ )
191
+ (output): BertSelfOutput(
192
+ (dense): Linear(in_features=768, out_features=768, bias=True)
193
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
194
+ (dropout): Dropout(p=0.1, inplace=False)
195
+ )
196
+ )
197
+ (intermediate): BertIntermediate(
198
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
199
+ (intermediate_act_fn): GELUActivation()
200
+ )
201
+ (output): BertOutput(
202
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
203
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
204
+ (dropout): Dropout(p=0.1, inplace=False)
205
+ )
206
+ )
207
+ (8): BertLayer(
208
+ (attention): BertAttention(
209
+ (self): BertSelfAttention(
210
+ (query): Linear(in_features=768, out_features=768, bias=True)
211
+ (key): Linear(in_features=768, out_features=768, bias=True)
212
+ (value): Linear(in_features=768, out_features=768, bias=True)
213
+ (dropout): Dropout(p=0.1, inplace=False)
214
+ )
215
+ (output): BertSelfOutput(
216
+ (dense): Linear(in_features=768, out_features=768, bias=True)
217
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
218
+ (dropout): Dropout(p=0.1, inplace=False)
219
+ )
220
+ )
221
+ (intermediate): BertIntermediate(
222
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
223
+ (intermediate_act_fn): GELUActivation()
224
+ )
225
+ (output): BertOutput(
226
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
227
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
228
+ (dropout): Dropout(p=0.1, inplace=False)
229
+ )
230
+ )
231
+ (9): BertLayer(
232
+ (attention): BertAttention(
233
+ (self): BertSelfAttention(
234
+ (query): Linear(in_features=768, out_features=768, bias=True)
235
+ (key): Linear(in_features=768, out_features=768, bias=True)
236
+ (value): Linear(in_features=768, out_features=768, bias=True)
237
+ (dropout): Dropout(p=0.1, inplace=False)
238
+ )
239
+ (output): BertSelfOutput(
240
+ (dense): Linear(in_features=768, out_features=768, bias=True)
241
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
242
+ (dropout): Dropout(p=0.1, inplace=False)
243
+ )
244
+ )
245
+ (intermediate): BertIntermediate(
246
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
247
+ (intermediate_act_fn): GELUActivation()
248
+ )
249
+ (output): BertOutput(
250
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
251
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
252
+ (dropout): Dropout(p=0.1, inplace=False)
253
+ )
254
+ )
255
+ (10): BertLayer(
256
+ (attention): BertAttention(
257
+ (self): BertSelfAttention(
258
+ (query): Linear(in_features=768, out_features=768, bias=True)
259
+ (key): Linear(in_features=768, out_features=768, bias=True)
260
+ (value): Linear(in_features=768, out_features=768, bias=True)
261
+ (dropout): Dropout(p=0.1, inplace=False)
262
+ )
263
+ (output): BertSelfOutput(
264
+ (dense): Linear(in_features=768, out_features=768, bias=True)
265
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
266
+ (dropout): Dropout(p=0.1, inplace=False)
267
+ )
268
+ )
269
+ (intermediate): BertIntermediate(
270
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
271
+ (intermediate_act_fn): GELUActivation()
272
+ )
273
+ (output): BertOutput(
274
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
275
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
276
+ (dropout): Dropout(p=0.1, inplace=False)
277
+ )
278
+ )
279
+ (11): BertLayer(
280
+ (attention): BertAttention(
281
+ (self): BertSelfAttention(
282
+ (query): Linear(in_features=768, out_features=768, bias=True)
283
+ (key): Linear(in_features=768, out_features=768, bias=True)
284
+ (value): Linear(in_features=768, out_features=768, bias=True)
285
+ (dropout): Dropout(p=0.1, inplace=False)
286
+ )
287
+ (output): BertSelfOutput(
288
+ (dense): Linear(in_features=768, out_features=768, bias=True)
289
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
290
+ (dropout): Dropout(p=0.1, inplace=False)
291
+ )
292
+ )
293
+ (intermediate): BertIntermediate(
294
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
295
+ (intermediate_act_fn): GELUActivation()
296
+ )
297
+ (output): BertOutput(
298
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
299
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
300
+ (dropout): Dropout(p=0.1, inplace=False)
301
+ )
302
+ )
303
+ )
304
+ )
305
+ (pooler): BertPooler(
306
+ (dense): Linear(in_features=768, out_features=768, bias=True)
307
+ (activation): Tanh()
308
+ )
309
+ )
310
+ )
311
+ (list_embedding_1): FlairEmbeddings(
312
+ (lm): LanguageModel(
313
+ (drop): Dropout(p=0.5, inplace=False)
314
+ (encoder): Embedding(275, 100)
315
+ (rnn): LSTM(100, 1024)
316
+ (decoder): Linear(in_features=1024, out_features=275, bias=True)
317
+ )
318
+ )
319
+ (list_embedding_2): FlairEmbeddings(
320
+ (lm): LanguageModel(
321
+ (drop): Dropout(p=0.5, inplace=False)
322
+ (encoder): Embedding(275, 100)
323
+ (rnn): LSTM(100, 1024)
324
+ (decoder): Linear(in_features=1024, out_features=275, bias=True)
325
+ )
326
+ )
327
+ )
328
+ (word_dropout): WordDropout(p=0.05)
329
+ (locked_dropout): LockedDropout(p=0.5)
330
+ (embedding2nn): Linear(in_features=2816, out_features=2816, bias=True)
331
+ (linear): Linear(in_features=2816, out_features=13, bias=True)
332
+ (loss_function): CrossEntropyLoss()
333
+ )"
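The `in_features=2816` of the `embedding2nn` and `linear` layers in the dump above is consistent with concatenating the three stacked embeddings: one 768-dimensional BERT output plus two Flair language models with 1024 hidden units each. A small arithmetic check, assuming each embedding contributes exactly its hidden size (the variable names are illustrative and not part of the Flair API):

```python
# Widths read off the model dump; the names are illustrative only.
bert_hidden = 768        # TransformerWordEmbeddings (BertModel hidden size)
flair_lm_hidden = 1024   # each FlairEmbeddings LSTM(100, 1024)
num_flair_lms = 2        # list_embedding_1 and list_embedding_2

# StackedEmbeddings concatenates the per-token vectors of its members.
stacked_width = bert_hidden + num_flair_lms * flair_lm_hidden
print(stacked_width)
```

The result matches the input width of the tagger's two linear layers.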
+ 2022-10-01 00:23:25,114 ----------------------------------------------------------------------------------------------------
+ 2022-10-01 00:23:25,115 Corpus: "Corpus: 70000 train + 15000 dev + 15000 test sentences"
+ 2022-10-01 00:23:25,115 ----------------------------------------------------------------------------------------------------
+ 2022-10-01 00:23:25,115 Parameters:
+ 2022-10-01 00:23:25,116 - learning_rate: "0.010000"
+ 2022-10-01 00:23:25,116 - mini_batch_size: "8"
+ 2022-10-01 00:23:25,116 - patience: "3"
+ 2022-10-01 00:23:25,116 - anneal_factor: "0.5"
+ 2022-10-01 00:23:25,116 - max_epochs: "2"
+ 2022-10-01 00:23:25,116 - shuffle: "True"
+ 2022-10-01 00:23:25,117 - train_with_dev: "False"
+ 2022-10-01 00:23:25,117 - batch_growth_annealing: "False"
+ 2022-10-01 00:23:25,117 ----------------------------------------------------------------------------------------------------
+ 2022-10-01 00:23:25,117 Model training base path: "c:\Users\Ivan\Documents\Projects\Yoda\NER\model\flair\src\..\models\mix_trans_word"
+ 2022-10-01 00:23:25,117 ----------------------------------------------------------------------------------------------------
+ 2022-10-01 00:23:25,118 Device: cuda:0
+ 2022-10-01 00:23:25,118 ----------------------------------------------------------------------------------------------------
+ 2022-10-01 00:23:25,118 Embeddings storage mode: cpu
+ 2022-10-01 00:23:25,119 ----------------------------------------------------------------------------------------------------
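The corpus size and `mini_batch_size` logged above determine how many iterations one epoch takes. A minimal sketch of that arithmetic, assuming the trainer emits ten progress lines per epoch (an assumption made here for illustration, not something the log states):

```python
import math

train_sentences = 70000   # from the "Corpus" line
mini_batch_size = 8       # from the "Parameters" block

# One iteration consumes one mini-batch; a final partial batch still counts.
iters_per_epoch = math.ceil(train_sentences / mini_batch_size)

# Assumption: progress is reported at ten evenly spaced points per epoch.
report_every = iters_per_epoch // 10
print(iters_per_epoch, report_every)
```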
+ 2022-10-01 00:25:10,652 epoch 1 - iter 875/8750 - loss 0.52734710 - samples/sec: 66.36 - lr: 0.010000
+ 2022-10-01 00:26:56,050 epoch 1 - iter 1750/8750 - loss 0.40571165 - samples/sec: 66.45 - lr: 0.010000
+ 2022-10-01 00:28:42,758 epoch 1 - iter 2625/8750 - loss 0.33981350 - samples/sec: 65.63 - lr: 0.010000
+ 2022-10-01 00:30:27,826 epoch 1 - iter 3500/8750 - loss 0.29553411 - samples/sec: 66.66 - lr: 0.010000
+ 2022-10-01 00:32:13,605 epoch 1 - iter 4375/8750 - loss 0.26472648 - samples/sec: 66.21 - lr: 0.010000
+ 2022-10-01 00:33:58,962 epoch 1 - iter 5250/8750 - loss 0.24119392 - samples/sec: 66.47 - lr: 0.010000
+ 2022-10-01 00:35:44,264 epoch 1 - iter 6125/8750 - loss 0.22350560 - samples/sec: 66.50 - lr: 0.010000
+ 2022-10-01 00:37:29,676 epoch 1 - iter 7000/8750 - loss 0.20938707 - samples/sec: 66.43 - lr: 0.010000
+ 2022-10-01 00:39:17,828 epoch 1 - iter 7875/8750 - loss 0.19801233 - samples/sec: 64.75 - lr: 0.010000
+ 2022-10-01 00:41:05,621 epoch 1 - iter 8750/8750 - loss 0.18900810 - samples/sec: 64.98 - lr: 0.010000
+ 2022-10-01 00:41:05,624 ----------------------------------------------------------------------------------------------------
+ 2022-10-01 00:41:05,624 EPOCH 1 done: loss 0.1890 - lr 0.010000
+ 2022-10-01 00:43:16,083 Evaluating as a multi-label problem: False
+ 2022-10-01 00:43:16,227 DEV : loss 0.06317088007926941 - f1-score (micro avg) 0.9585
+ 2022-10-01 00:43:17,308 BAD EPOCHS (no improvement): 0
+ 2022-10-01 00:43:17,309 saving best model
+ 2022-10-01 00:43:18,885 ----------------------------------------------------------------------------------------------------
+ 2022-10-01 00:45:00,373 epoch 2 - iter 875/8750 - loss 0.09938527 - samples/sec: 69.02 - lr: 0.010000
+ 2022-10-01 00:46:39,918 epoch 2 - iter 1750/8750 - loss 0.09782604 - samples/sec: 70.36 - lr: 0.010000
+ 2022-10-01 00:48:19,288 epoch 2 - iter 2625/8750 - loss 0.09732946 - samples/sec: 70.50 - lr: 0.010000
+ 2022-10-01 00:49:56,913 epoch 2 - iter 3500/8750 - loss 0.09652202 - samples/sec: 71.76 - lr: 0.010000
+ 2022-10-01 00:51:35,781 epoch 2 - iter 4375/8750 - loss 0.09592801 - samples/sec: 70.86 - lr: 0.010000
+ 2022-10-01 00:53:12,838 epoch 2 - iter 5250/8750 - loss 0.09478132 - samples/sec: 72.17 - lr: 0.010000
+ 2022-10-01 00:54:49,247 epoch 2 - iter 6125/8750 - loss 0.09405506 - samples/sec: 72.65 - lr: 0.010000
+ 2022-10-01 00:56:26,656 epoch 2 - iter 7000/8750 - loss 0.09270363 - samples/sec: 71.90 - lr: 0.010000
+ 2022-10-01 00:58:04,050 epoch 2 - iter 7875/8750 - loss 0.09222568 - samples/sec: 71.92 - lr: 0.010000
+ 2022-10-01 00:59:41,351 epoch 2 - iter 8750/8750 - loss 0.09155321 - samples/sec: 71.98 - lr: 0.010000
+ 2022-10-01 00:59:41,359 ----------------------------------------------------------------------------------------------------
+ 2022-10-01 00:59:41,360 EPOCH 2 done: loss 0.0916 - lr 0.010000
+ 2022-10-01 01:01:38,941 Evaluating as a multi-label problem: False
+ 2022-10-01 01:01:39,054 DEV : loss 0.04371843859553337 - f1-score (micro avg) 0.9749
+ 2022-10-01 01:01:40,056 BAD EPOCHS (no improvement): 0
+ 2022-10-01 01:01:40,058 saving best model
+ 2022-10-01 01:01:42,979 ----------------------------------------------------------------------------------------------------
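The per-iteration progress lines above follow a regular pattern, so the loss curve can be recovered from the log with a regular expression. A rough sketch, assuming the line layout shown in this log; `parse_progress` and `LINE_RE` are hypothetical helper names, not Flair functions:

```python
import re

# Matches progress lines such as
# "... epoch 1 - iter 875/8750 - loss 0.52734710 - samples/sec: 66.36 - lr: 0.010000"
LINE_RE = re.compile(
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+)"
    r" - loss (?P<loss>[\d.]+) - samples/sec: (?P<rate>[\d.]+)"
)

def parse_progress(lines):
    """Return (epoch, iteration, loss, samples_per_sec) for each progress line."""
    rows = []
    for line in lines:
        m = LINE_RE.search(line)
        if m:  # summary lines such as "EPOCH 1 done" do not match and are skipped
            rows.append(
                (int(m["epoch"]), int(m["iter"]), float(m["loss"]), float(m["rate"]))
            )
    return rows

# Three lines copied from the log above.
sample = [
    "2022-10-01 00:25:10,652 epoch 1 - iter 875/8750 - loss 0.52734710 - samples/sec: 66.36 - lr: 0.010000",
    "2022-10-01 00:41:05,624 EPOCH 1 done: loss 0.1890 - lr 0.010000",
    "2022-10-01 00:59:41,351 epoch 2 - iter 8750/8750 - loss 0.09155321 - samples/sec: 71.98 - lr: 0.010000",
]
rows = parse_progress(sample)
for row in rows:
    print(row)
```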
+ 2022-10-01 01:01:42,986 loading file c:\Users\Ivan\Documents\Projects\Yoda\NER\model\flair\src\..\models\mix_trans_word\best-model.pt
+ 2022-10-01 01:01:46,879 SequenceTagger predicts: Dictionary with 13 tags: O, S-brand, B-brand, E-brand, I-brand, S-size, B-size, E-size, I-size, S-color, B-color, E-color, I-color
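The 13 tags listed above form a BIOES scheme: `O` plus `S-`, `B-`, `I-` and `E-` variants for each of the three labels (1 + 4 × 3 = 13). A minimal sketch of how such a tag sequence maps to entity spans; `bioes_to_spans` is a hypothetical helper and the example tokens are invented for illustration, not taken from the corpus:

```python
def bioes_to_spans(tags):
    """Convert a BIOES tag sequence into (label, start, end_exclusive) spans."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        prefix, _, name = tag.partition("-")
        if prefix == "S":                      # single-token entity
            spans.append((name, i, i + 1))
            start, label = None, None
        elif prefix == "B":                    # entity begins
            start, label = i, name
        elif prefix == "E" and start is not None and name == label:
            spans.append((name, start, i + 1))  # entity ends
            start, label = None, None
        elif prefix == "I" and start is not None and name == label:
            continue                           # inside an open entity
        else:
            start, label = None, None          # "O" or an inconsistent tag
    return spans

# Invented example sentence and tags, for illustration only.
tokens = ["Acme", "Sport", "shoes", "size", "42", "dark", "blue"]
tags = ["B-brand", "E-brand", "O", "O", "S-size", "B-color", "E-color"]
for name, start, end in bioes_to_spans(tags):
    print(name, tokens[start:end])
```

Inconsistent sequences (for example an `E-` tag with no preceding `B-`) are dropped here; other decoders may choose to repair them instead.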
+ 2022-10-01 01:03:40,258 Evaluating as a multi-label problem: False
+ 2022-10-01 01:03:40,388 0.9719 0.9777 0.9748 0.951
+ 2022-10-01 01:03:40,389
+ Results:
+ - F-score (micro) 0.9748
+ - F-score (macro) 0.9624
+ - Accuracy 0.951
+
+ By class:
+               precision    recall  f1-score   support
+
+        brand     0.9779    0.9849    0.9814     11779
+         size     0.9780    0.9821    0.9800      3125
+        color     0.9249    0.9264    0.9256      1915
+
+    micro avg     0.9719    0.9777    0.9748     16819
+    macro avg     0.9603    0.9644    0.9624     16819
+ weighted avg     0.9719    0.9777    0.9748     16819
+
+ 2022-10-01 01:03:40,391 ----------------------------------------------------------------------------------------------------
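The averaged rows of the report can be cross-checked against the per-class rows. The sketch below recomputes them from the rounded values in the table, so agreement is only expected to within rounding (the macro F1 comes out slightly below the logged 0.9624 for that reason):

```python
# Per-class rows copied from the "By class" table: (precision, recall, f1, support)
by_class = {
    "brand": (0.9779, 0.9849, 0.9814, 11779),
    "size":  (0.9780, 0.9821, 0.9800, 3125),
    "color": (0.9249, 0.9264, 0.9256, 1915),
}

total_support = sum(row[3] for row in by_class.values())

# Macro average: unweighted mean over classes.
macro_f1 = sum(row[2] for row in by_class.values()) / len(by_class)

# Weighted average: mean over classes, weighted by support.
weighted_f1 = sum(row[2] * row[3] for row in by_class.values()) / total_support

# Micro F1: harmonic mean of the micro-averaged precision and recall.
micro_p, micro_r = 0.9719, 0.9777
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(total_support, round(macro_f1, 4), round(weighted_f1, 4), round(micro_f1, 4))
```

Because `brand` accounts for most of the support, the weighted and micro averages sit close to its score, while the macro average is pulled down by the weaker `color` class.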