stefan-it committed on
Commit 95f036a
1 Parent(s): 823e8fa

Upload ./training.log with huggingface_hub

Files changed (1)
  1. training.log +265 -0
training.log ADDED
@@ -0,0 +1,265 @@
+ 2024-03-26 15:31:07,433 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:31:07,433 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(31103, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=768, out_features=768, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=17, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2024-03-26 15:31:07,434 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:31:07,434 Corpus: 758 train + 94 dev + 96 test sentences
+ 2024-03-26 15:31:07,434 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:31:07,434 Train: 758 sentences
+ 2024-03-26 15:31:07,434 (train_with_dev=False, train_with_test=False)
+ 2024-03-26 15:31:07,434 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:31:07,434 Training Params:
+ 2024-03-26 15:31:07,434 - learning_rate: "3e-05"
+ 2024-03-26 15:31:07,434 - mini_batch_size: "16"
+ 2024-03-26 15:31:07,434 - max_epochs: "10"
+ 2024-03-26 15:31:07,434 - shuffle: "True"
+ 2024-03-26 15:31:07,434 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:31:07,434 Plugins:
+ 2024-03-26 15:31:07,434 - TensorboardLogger
+ 2024-03-26 15:31:07,434 - LinearScheduler | warmup_fraction: '0.1'
+ 2024-03-26 15:31:07,434 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:31:07,434 Final evaluation on model from best epoch (best-model.pt)
+ 2024-03-26 15:31:07,434 - metric: "('micro avg', 'f1-score')"
+ 2024-03-26 15:31:07,434 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:31:07,434 Computation:
+ 2024-03-26 15:31:07,434 - compute on device: cuda:0
+ 2024-03-26 15:31:07,434 - embedding storage: none
+ 2024-03-26 15:31:07,434 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:31:07,434 Model training base path: "flair-co-funer-german_dbmdz_bert_base-bs16-e10-lr3e-05-2"
+ 2024-03-26 15:31:07,434 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:31:07,434 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:31:07,434 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2024-03-26 15:31:09,153 epoch 1 - iter 4/48 - loss 3.07690001 - time (sec): 1.72 - samples/sec: 1757.53 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 15:31:11,233 epoch 1 - iter 8/48 - loss 3.05496224 - time (sec): 3.80 - samples/sec: 1634.29 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 15:31:13,072 epoch 1 - iter 12/48 - loss 2.97386894 - time (sec): 5.64 - samples/sec: 1581.12 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 15:31:15,073 epoch 1 - iter 16/48 - loss 2.85293571 - time (sec): 7.64 - samples/sec: 1588.44 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 15:31:17,248 epoch 1 - iter 20/48 - loss 2.74493363 - time (sec): 9.81 - samples/sec: 1557.15 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 15:31:20,254 epoch 1 - iter 24/48 - loss 2.64581724 - time (sec): 12.82 - samples/sec: 1418.04 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 15:31:22,640 epoch 1 - iter 28/48 - loss 2.52600763 - time (sec): 15.21 - samples/sec: 1401.69 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 15:31:23,456 epoch 1 - iter 32/48 - loss 2.44927452 - time (sec): 16.02 - samples/sec: 1457.37 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 15:31:24,710 epoch 1 - iter 36/48 - loss 2.36202901 - time (sec): 17.28 - samples/sec: 1513.78 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 15:31:26,567 epoch 1 - iter 40/48 - loss 2.28016152 - time (sec): 19.13 - samples/sec: 1520.48 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 15:31:28,440 epoch 1 - iter 44/48 - loss 2.19008437 - time (sec): 21.01 - samples/sec: 1521.10 - lr: 0.000027 - momentum: 0.000000
+ 2024-03-26 15:31:29,788 epoch 1 - iter 48/48 - loss 2.11466347 - time (sec): 22.35 - samples/sec: 1542.10 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 15:31:29,788 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:31:29,788 EPOCH 1 done: loss 2.1147 - lr: 0.000029
+ 2024-03-26 15:31:30,611 DEV : loss 0.8351971507072449 - f1-score (micro avg) 0.4472
+ 2024-03-26 15:31:30,612 saving best model
+ 2024-03-26 15:31:30,882 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:31:32,192 epoch 2 - iter 4/48 - loss 1.15516706 - time (sec): 1.31 - samples/sec: 2214.60 - lr: 0.000030 - momentum: 0.000000
+ 2024-03-26 15:31:34,020 epoch 2 - iter 8/48 - loss 0.97683226 - time (sec): 3.14 - samples/sec: 1943.54 - lr: 0.000030 - momentum: 0.000000
+ 2024-03-26 15:31:37,458 epoch 2 - iter 12/48 - loss 0.86710735 - time (sec): 6.58 - samples/sec: 1547.82 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 15:31:39,937 epoch 2 - iter 16/48 - loss 0.80348775 - time (sec): 9.05 - samples/sec: 1471.03 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 15:31:42,595 epoch 2 - iter 20/48 - loss 0.75215888 - time (sec): 11.71 - samples/sec: 1418.35 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 15:31:44,488 epoch 2 - iter 24/48 - loss 0.70436302 - time (sec): 13.61 - samples/sec: 1416.97 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 15:31:46,268 epoch 2 - iter 28/48 - loss 0.69084455 - time (sec): 15.39 - samples/sec: 1425.51 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 15:31:47,991 epoch 2 - iter 32/48 - loss 0.67192253 - time (sec): 17.11 - samples/sec: 1438.14 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 15:31:49,842 epoch 2 - iter 36/48 - loss 0.65489588 - time (sec): 18.96 - samples/sec: 1446.86 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 15:31:50,863 epoch 2 - iter 40/48 - loss 0.63872466 - time (sec): 19.98 - samples/sec: 1494.02 - lr: 0.000027 - momentum: 0.000000
+ 2024-03-26 15:31:52,297 epoch 2 - iter 44/48 - loss 0.62970353 - time (sec): 21.41 - samples/sec: 1513.73 - lr: 0.000027 - momentum: 0.000000
+ 2024-03-26 15:31:53,824 epoch 2 - iter 48/48 - loss 0.61156947 - time (sec): 22.94 - samples/sec: 1502.60 - lr: 0.000027 - momentum: 0.000000
+ 2024-03-26 15:31:53,824 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:31:53,824 EPOCH 2 done: loss 0.6116 - lr: 0.000027
+ 2024-03-26 15:31:54,747 DEV : loss 0.33092811703681946 - f1-score (micro avg) 0.8046
+ 2024-03-26 15:31:54,748 saving best model
+ 2024-03-26 15:31:55,217 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:31:57,865 epoch 3 - iter 4/48 - loss 0.33780163 - time (sec): 2.65 - samples/sec: 1136.57 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 15:32:00,011 epoch 3 - iter 8/48 - loss 0.33754782 - time (sec): 4.79 - samples/sec: 1324.89 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 15:32:01,586 epoch 3 - iter 12/48 - loss 0.35771746 - time (sec): 6.37 - samples/sec: 1393.06 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 15:32:03,330 epoch 3 - iter 16/48 - loss 0.33541111 - time (sec): 8.11 - samples/sec: 1401.05 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 15:32:04,480 epoch 3 - iter 20/48 - loss 0.33566464 - time (sec): 9.26 - samples/sec: 1477.25 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 15:32:06,324 epoch 3 - iter 24/48 - loss 0.34059768 - time (sec): 11.11 - samples/sec: 1481.53 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 15:32:08,770 epoch 3 - iter 28/48 - loss 0.33468202 - time (sec): 13.55 - samples/sec: 1427.62 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 15:32:10,614 epoch 3 - iter 32/48 - loss 0.33237610 - time (sec): 15.40 - samples/sec: 1437.92 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 15:32:12,049 epoch 3 - iter 36/48 - loss 0.32281669 - time (sec): 16.83 - samples/sec: 1472.21 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 15:32:14,321 epoch 3 - iter 40/48 - loss 0.31113411 - time (sec): 19.10 - samples/sec: 1445.24 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 15:32:17,590 epoch 3 - iter 44/48 - loss 0.28678284 - time (sec): 22.37 - samples/sec: 1440.30 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 15:32:18,841 epoch 3 - iter 48/48 - loss 0.28114112 - time (sec): 23.62 - samples/sec: 1459.23 - lr: 0.000023 - momentum: 0.000000
+ 2024-03-26 15:32:18,841 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:32:18,842 EPOCH 3 done: loss 0.2811 - lr: 0.000023
+ 2024-03-26 15:32:19,759 DEV : loss 0.2615453898906708 - f1-score (micro avg) 0.8483
+ 2024-03-26 15:32:19,761 saving best model
+ 2024-03-26 15:32:20,220 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:32:21,779 epoch 4 - iter 4/48 - loss 0.27731467 - time (sec): 1.56 - samples/sec: 1636.53 - lr: 0.000023 - momentum: 0.000000
+ 2024-03-26 15:32:23,988 epoch 4 - iter 8/48 - loss 0.23357535 - time (sec): 3.77 - samples/sec: 1590.61 - lr: 0.000023 - momentum: 0.000000
+ 2024-03-26 15:32:25,233 epoch 4 - iter 12/48 - loss 0.21777199 - time (sec): 5.01 - samples/sec: 1667.59 - lr: 0.000023 - momentum: 0.000000
+ 2024-03-26 15:32:27,454 epoch 4 - iter 16/48 - loss 0.21671924 - time (sec): 7.23 - samples/sec: 1558.49 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 15:32:29,973 epoch 4 - iter 20/48 - loss 0.20437028 - time (sec): 9.75 - samples/sec: 1433.65 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 15:32:31,983 epoch 4 - iter 24/48 - loss 0.20982545 - time (sec): 11.76 - samples/sec: 1431.14 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 15:32:34,093 epoch 4 - iter 28/48 - loss 0.20703194 - time (sec): 13.87 - samples/sec: 1434.09 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 15:32:36,637 epoch 4 - iter 32/48 - loss 0.20143413 - time (sec): 16.42 - samples/sec: 1404.71 - lr: 0.000021 - momentum: 0.000000
+ 2024-03-26 15:32:39,428 epoch 4 - iter 36/48 - loss 0.19267077 - time (sec): 19.21 - samples/sec: 1392.72 - lr: 0.000021 - momentum: 0.000000
+ 2024-03-26 15:32:41,108 epoch 4 - iter 40/48 - loss 0.18832931 - time (sec): 20.89 - samples/sec: 1392.91 - lr: 0.000021 - momentum: 0.000000
+ 2024-03-26 15:32:43,083 epoch 4 - iter 44/48 - loss 0.18668573 - time (sec): 22.86 - samples/sec: 1396.30 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 15:32:44,737 epoch 4 - iter 48/48 - loss 0.18524935 - time (sec): 24.52 - samples/sec: 1406.10 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 15:32:44,737 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:32:44,737 EPOCH 4 done: loss 0.1852 - lr: 0.000020
+ 2024-03-26 15:32:45,657 DEV : loss 0.22585716843605042 - f1-score (micro avg) 0.8805
+ 2024-03-26 15:32:45,658 saving best model
+ 2024-03-26 15:32:46,110 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:32:46,935 epoch 5 - iter 4/48 - loss 0.12116649 - time (sec): 0.82 - samples/sec: 2223.43 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 15:32:48,298 epoch 5 - iter 8/48 - loss 0.15184731 - time (sec): 2.19 - samples/sec: 2033.71 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 15:32:51,031 epoch 5 - iter 12/48 - loss 0.15337292 - time (sec): 4.92 - samples/sec: 1621.70 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 15:32:53,984 epoch 5 - iter 16/48 - loss 0.14647387 - time (sec): 7.87 - samples/sec: 1433.22 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 15:32:55,373 epoch 5 - iter 20/48 - loss 0.14839734 - time (sec): 9.26 - samples/sec: 1482.03 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 15:32:57,830 epoch 5 - iter 24/48 - loss 0.14399635 - time (sec): 11.72 - samples/sec: 1429.63 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 15:32:59,905 epoch 5 - iter 28/48 - loss 0.13921168 - time (sec): 13.79 - samples/sec: 1416.46 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 15:33:02,186 epoch 5 - iter 32/48 - loss 0.14270024 - time (sec): 16.08 - samples/sec: 1440.86 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 15:33:03,648 epoch 5 - iter 36/48 - loss 0.14605617 - time (sec): 17.54 - samples/sec: 1464.65 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 15:33:06,164 epoch 5 - iter 40/48 - loss 0.13970169 - time (sec): 20.05 - samples/sec: 1416.75 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 15:33:08,231 epoch 5 - iter 44/48 - loss 0.13750957 - time (sec): 22.12 - samples/sec: 1430.14 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 15:33:10,191 epoch 5 - iter 48/48 - loss 0.13714226 - time (sec): 24.08 - samples/sec: 1431.52 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 15:33:10,192 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:33:10,192 EPOCH 5 done: loss 0.1371 - lr: 0.000017
+ 2024-03-26 15:33:11,115 DEV : loss 0.20046745240688324 - f1-score (micro avg) 0.8771
+ 2024-03-26 15:33:11,116 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:33:12,676 epoch 6 - iter 4/48 - loss 0.10272289 - time (sec): 1.56 - samples/sec: 1596.25 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 15:33:15,066 epoch 6 - iter 8/48 - loss 0.09870474 - time (sec): 3.95 - samples/sec: 1620.23 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 15:33:16,991 epoch 6 - iter 12/48 - loss 0.10078181 - time (sec): 5.87 - samples/sec: 1541.84 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 15:33:19,000 epoch 6 - iter 16/48 - loss 0.09675325 - time (sec): 7.88 - samples/sec: 1538.21 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 15:33:21,737 epoch 6 - iter 20/48 - loss 0.09973824 - time (sec): 10.62 - samples/sec: 1504.31 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 15:33:23,239 epoch 6 - iter 24/48 - loss 0.11510382 - time (sec): 12.12 - samples/sec: 1526.93 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 15:33:24,605 epoch 6 - iter 28/48 - loss 0.11454603 - time (sec): 13.49 - samples/sec: 1532.30 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 15:33:25,766 epoch 6 - iter 32/48 - loss 0.11090914 - time (sec): 14.65 - samples/sec: 1552.90 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 15:33:27,228 epoch 6 - iter 36/48 - loss 0.10548167 - time (sec): 16.11 - samples/sec: 1584.76 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 15:33:29,120 epoch 6 - iter 40/48 - loss 0.10799176 - time (sec): 18.00 - samples/sec: 1574.06 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 15:33:31,286 epoch 6 - iter 44/48 - loss 0.10341451 - time (sec): 20.17 - samples/sec: 1594.19 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 15:33:32,955 epoch 6 - iter 48/48 - loss 0.10237983 - time (sec): 21.84 - samples/sec: 1578.46 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 15:33:32,956 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:33:32,956 EPOCH 6 done: loss 0.1024 - lr: 0.000014
+ 2024-03-26 15:33:33,865 DEV : loss 0.1799185872077942 - f1-score (micro avg) 0.903
+ 2024-03-26 15:33:33,867 saving best model
+ 2024-03-26 15:33:34,313 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:33:35,930 epoch 7 - iter 4/48 - loss 0.07362542 - time (sec): 1.62 - samples/sec: 1506.35 - lr: 0.000013 - momentum: 0.000000
+ 2024-03-26 15:33:37,620 epoch 7 - iter 8/48 - loss 0.07695036 - time (sec): 3.31 - samples/sec: 1497.95 - lr: 0.000013 - momentum: 0.000000
+ 2024-03-26 15:33:39,724 epoch 7 - iter 12/48 - loss 0.08244330 - time (sec): 5.41 - samples/sec: 1454.46 - lr: 0.000013 - momentum: 0.000000
+ 2024-03-26 15:33:41,740 epoch 7 - iter 16/48 - loss 0.08062151 - time (sec): 7.43 - samples/sec: 1500.21 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 15:33:42,377 epoch 7 - iter 20/48 - loss 0.07768609 - time (sec): 8.06 - samples/sec: 1607.06 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 15:33:43,958 epoch 7 - iter 24/48 - loss 0.07804374 - time (sec): 9.64 - samples/sec: 1588.68 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 15:33:46,793 epoch 7 - iter 28/48 - loss 0.07671140 - time (sec): 12.48 - samples/sec: 1492.31 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 15:33:49,542 epoch 7 - iter 32/48 - loss 0.07602401 - time (sec): 15.23 - samples/sec: 1422.69 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 15:33:52,262 epoch 7 - iter 36/48 - loss 0.07873676 - time (sec): 17.95 - samples/sec: 1436.35 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 15:33:54,220 epoch 7 - iter 40/48 - loss 0.08275888 - time (sec): 19.91 - samples/sec: 1444.10 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 15:33:56,756 epoch 7 - iter 44/48 - loss 0.08223865 - time (sec): 22.44 - samples/sec: 1419.37 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 15:33:58,492 epoch 7 - iter 48/48 - loss 0.08132377 - time (sec): 24.18 - samples/sec: 1425.72 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 15:33:58,492 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:33:58,492 EPOCH 7 done: loss 0.0813 - lr: 0.000010
+ 2024-03-26 15:33:59,405 DEV : loss 0.17715860903263092 - f1-score (micro avg) 0.9062
+ 2024-03-26 15:33:59,408 saving best model
+ 2024-03-26 15:33:59,861 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:34:02,494 epoch 8 - iter 4/48 - loss 0.07748369 - time (sec): 2.63 - samples/sec: 1255.51 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 15:34:04,546 epoch 8 - iter 8/48 - loss 0.06084314 - time (sec): 4.68 - samples/sec: 1252.87 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 15:34:07,704 epoch 8 - iter 12/48 - loss 0.06309502 - time (sec): 7.84 - samples/sec: 1235.93 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 15:34:09,620 epoch 8 - iter 16/48 - loss 0.07258115 - time (sec): 9.76 - samples/sec: 1265.04 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 15:34:11,081 epoch 8 - iter 20/48 - loss 0.07019402 - time (sec): 11.22 - samples/sec: 1309.06 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 15:34:13,491 epoch 8 - iter 24/48 - loss 0.06964750 - time (sec): 13.63 - samples/sec: 1309.36 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 15:34:15,234 epoch 8 - iter 28/48 - loss 0.07274345 - time (sec): 15.37 - samples/sec: 1345.28 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 15:34:16,883 epoch 8 - iter 32/48 - loss 0.07155021 - time (sec): 17.02 - samples/sec: 1366.86 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 15:34:18,168 epoch 8 - iter 36/48 - loss 0.06972937 - time (sec): 18.30 - samples/sec: 1397.60 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 15:34:20,471 epoch 8 - iter 40/48 - loss 0.07030178 - time (sec): 20.61 - samples/sec: 1406.83 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 15:34:23,313 epoch 8 - iter 44/48 - loss 0.06695918 - time (sec): 23.45 - samples/sec: 1373.85 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 15:34:25,225 epoch 8 - iter 48/48 - loss 0.06596868 - time (sec): 25.36 - samples/sec: 1359.21 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 15:34:25,225 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:34:25,225 EPOCH 8 done: loss 0.0660 - lr: 0.000007
+ 2024-03-26 15:34:26,138 DEV : loss 0.18558232486248016 - f1-score (micro avg) 0.9211
+ 2024-03-26 15:34:26,141 saving best model
+ 2024-03-26 15:34:26,605 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:34:28,417 epoch 9 - iter 4/48 - loss 0.06772714 - time (sec): 1.81 - samples/sec: 1570.73 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 15:34:30,821 epoch 9 - iter 8/48 - loss 0.05495595 - time (sec): 4.21 - samples/sec: 1454.92 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 15:34:33,163 epoch 9 - iter 12/48 - loss 0.06627844 - time (sec): 6.56 - samples/sec: 1407.75 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 15:34:35,189 epoch 9 - iter 16/48 - loss 0.06614638 - time (sec): 8.58 - samples/sec: 1409.16 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 15:34:36,638 epoch 9 - iter 20/48 - loss 0.05859714 - time (sec): 10.03 - samples/sec: 1469.04 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 15:34:37,839 epoch 9 - iter 24/48 - loss 0.05484876 - time (sec): 11.23 - samples/sec: 1516.66 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 15:34:39,529 epoch 9 - iter 28/48 - loss 0.05309367 - time (sec): 12.92 - samples/sec: 1530.38 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 15:34:41,777 epoch 9 - iter 32/48 - loss 0.05768250 - time (sec): 15.17 - samples/sec: 1515.76 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 15:34:44,442 epoch 9 - iter 36/48 - loss 0.05677322 - time (sec): 17.84 - samples/sec: 1464.59 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 15:34:47,361 epoch 9 - iter 40/48 - loss 0.05707459 - time (sec): 20.75 - samples/sec: 1420.11 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 15:34:49,162 epoch 9 - iter 44/48 - loss 0.05624355 - time (sec): 22.56 - samples/sec: 1435.56 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 15:34:50,192 epoch 9 - iter 48/48 - loss 0.05651383 - time (sec): 23.59 - samples/sec: 1461.56 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 15:34:50,192 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:34:50,192 EPOCH 9 done: loss 0.0565 - lr: 0.000004
+ 2024-03-26 15:34:51,127 DEV : loss 0.18057239055633545 - f1-score (micro avg) 0.9321
+ 2024-03-26 15:34:51,128 saving best model
+ 2024-03-26 15:34:51,585 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:34:53,875 epoch 10 - iter 4/48 - loss 0.02487919 - time (sec): 2.29 - samples/sec: 1442.31 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 15:34:55,930 epoch 10 - iter 8/48 - loss 0.03646152 - time (sec): 4.34 - samples/sec: 1422.14 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 15:34:57,846 epoch 10 - iter 12/48 - loss 0.03623072 - time (sec): 6.26 - samples/sec: 1409.39 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 15:34:59,082 epoch 10 - iter 16/48 - loss 0.03960043 - time (sec): 7.50 - samples/sec: 1470.03 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 15:35:00,988 epoch 10 - iter 20/48 - loss 0.04632415 - time (sec): 9.40 - samples/sec: 1457.96 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 15:35:03,197 epoch 10 - iter 24/48 - loss 0.05286228 - time (sec): 11.61 - samples/sec: 1430.23 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 15:35:04,086 epoch 10 - iter 28/48 - loss 0.05410697 - time (sec): 12.50 - samples/sec: 1503.04 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 15:35:05,349 epoch 10 - iter 32/48 - loss 0.05294441 - time (sec): 13.76 - samples/sec: 1543.10 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 15:35:08,108 epoch 10 - iter 36/48 - loss 0.05023736 - time (sec): 16.52 - samples/sec: 1494.52 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 15:35:10,514 epoch 10 - iter 40/48 - loss 0.05048026 - time (sec): 18.93 - samples/sec: 1519.04 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 15:35:13,059 epoch 10 - iter 44/48 - loss 0.04935967 - time (sec): 21.47 - samples/sec: 1493.67 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 15:35:14,979 epoch 10 - iter 48/48 - loss 0.04860975 - time (sec): 23.39 - samples/sec: 1473.54 - lr: 0.000000 - momentum: 0.000000
+ 2024-03-26 15:35:14,980 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:35:14,980 EPOCH 10 done: loss 0.0486 - lr: 0.000000
+ 2024-03-26 15:35:15,900 DEV : loss 0.1853199601173401 - f1-score (micro avg) 0.9257
+ 2024-03-26 15:35:16,184 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:35:16,185 Loading model from best epoch ...
+ 2024-03-26 15:35:17,059 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
+ 2024-03-26 15:35:17,909
+ Results:
+ - F-score (micro) 0.8995
+ - F-score (macro) 0.6839
+ - Accuracy 0.8208
+
+ By class:
+                precision    recall  f1-score   support
+
+   Unternehmen     0.9008    0.8872    0.8939       266
+   Auslagerung     0.8479    0.8956    0.8711       249
+           Ort     0.9565    0.9851    0.9706       134
+      Software     0.0000    0.0000    0.0000         0
+
+     micro avg     0.8887    0.9106    0.8995       649
+     macro avg     0.6763    0.6920    0.6839       649
+  weighted avg     0.8920    0.9106    0.9010       649
+
+ 2024-03-26 15:35:17,909 ----------------------------------------------------------------------------------------------------
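For reference, below is a minimal Flair fine-tuning sketch that roughly matches the configuration recorded in this log (learning rate 3e-05, mini-batch size 16, 10 epochs, linear warmup fraction 0.1, plain linear tag head without CRF or RNN). The corpus paths, column layout, and the exact base checkpoint (assumed here to be a dbmdz German BERT base model) are illustrative assumptions and are not taken from the log itself.

```python
# Sketch of a Flair fine-tuning run matching the hyperparameters in this log.
# Assumptions (not from the log): corpus files and column layout under "data/",
# and the base checkpoint name "dbmdz/bert-base-german-cased".
from flair.datasets import ColumnCorpus
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Token-level corpus (the log reports 758 train / 94 dev / 96 test sentences).
corpus = ColumnCorpus(
    data_folder="data",                   # hypothetical path
    column_format={0: "text", 1: "ner"},  # hypothetical column layout
    train_file="train.txt",
    dev_file="dev.txt",
    test_file="test.txt",
)

# Span labels for the four entity types; Flair expands them to the
# 17 BIOES tags shown in the log.
label_dict = corpus.make_label_dictionary(label_type="ner")

# Transformer word embeddings over a German BERT base model (768-dim).
embeddings = TransformerWordEmbeddings(
    model="dbmdz/bert-base-german-cased",  # assumed checkpoint
    layers="-1",
    fine_tune=True,
)

# Plain linear tag head (no CRF, no RNN), as in the printed architecture.
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# fine_tune() uses AdamW with a linear warmup schedule (warmup_fraction=0.1
# by default), matching the LinearScheduler plugin listed in the log; the
# TensorboardLogger plugin is omitted from this sketch.
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "flair-co-funer-german_dbmdz_bert_base-bs16-e10-lr3e-05-2",
    learning_rate=3e-05,
    mini_batch_size=16,
    max_epochs=10,
)
```

After training, the best checkpoint (best-model.pt) can be reloaded with SequenceTagger.load() and applied to new sentences; the final evaluation block above reports its test performance (micro F1 0.8995).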