stefan-it committed on
Commit
db5ada2
1 Parent(s): e412e16

Upload ./training.log with huggingface_hub

  1. training.log +246 -0
training.log ADDED
@@ -0,0 +1,246 @@
+ 2024-03-26 10:24:52,166 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:24:52,166 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(31103, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=768, out_features=768, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=17, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2024-03-26 10:24:52,166 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:24:52,166 Corpus: 758 train + 94 dev + 96 test sentences
+ 2024-03-26 10:24:52,166 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:24:52,166 Train: 758 sentences
+ 2024-03-26 10:24:52,166 (train_with_dev=False, train_with_test=False)
+ 2024-03-26 10:24:52,166 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:24:52,166 Training Params:
+ 2024-03-26 10:24:52,166 - learning_rate: "3e-05"
+ 2024-03-26 10:24:52,166 - mini_batch_size: "8"
+ 2024-03-26 10:24:52,166 - max_epochs: "10"
+ 2024-03-26 10:24:52,166 - shuffle: "True"
+ 2024-03-26 10:24:52,166 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:24:52,166 Plugins:
+ 2024-03-26 10:24:52,166 - TensorboardLogger
+ 2024-03-26 10:24:52,166 - LinearScheduler | warmup_fraction: '0.1'
+ 2024-03-26 10:24:52,166 ----------------------------------------------------------------------------------------------------
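The training parameters and the `LinearScheduler | warmup_fraction: '0.1'` plugin above determine the `lr` column printed in the per-iteration lines. The following is a minimal sketch of a linear warmup followed by linear decay, assuming the scheduler behaves like the common "linear schedule with warmup"; the function name `linear_warmup_decay` and the constants are hypothetical and are not taken from Flair's source. Under that assumption, 758 training sentences at a mini-batch size of 8 give 95 iterations per epoch and 950 updates in total, of which the first 95 are warmup.

```python
import math

PEAK_LR = 3e-5          # learning_rate from the log
WARMUP_FRACTION = 0.1   # warmup_fraction from the log
STEPS_PER_EPOCH = math.ceil(758 / 8)   # 758 sentences, mini_batch_size 8
TOTAL_STEPS = STEPS_PER_EPOCH * 10     # max_epochs 10


def linear_warmup_decay(step: int) -> float:
    """Hypothetical helper: learning rate after `step` optimizer updates.

    Assumes linear warmup from 0 to PEAK_LR, then linear decay back to 0.
    """
    warmup_steps = round(TOTAL_STEPS * WARMUP_FRACTION)
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    remaining = TOTAL_STEPS - step
    return PEAK_LR * max(0.0, remaining / max(1, TOTAL_STEPS - warmup_steps))


# Compare a few points with the "lr:" values printed in the log.
for epoch, iteration in [(1, 9), (1, 90), (2, 9), (10, 90)]:
    step = (epoch - 1) * STEPS_PER_EPOCH + iteration
    print(f"epoch {epoch} iter {iteration}: lr {linear_warmup_decay(step):.6f}")
```

Rounded to six decimals, these values agree with the logged `lr` at the same iterations, which is consistent with (but does not prove) the assumed schedule.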
+ 2024-03-26 10:24:52,166 Final evaluation on model from best epoch (best-model.pt)
+ 2024-03-26 10:24:52,166 - metric: "('micro avg', 'f1-score')"
+ 2024-03-26 10:24:52,166 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:24:52,167 Computation:
+ 2024-03-26 10:24:52,167 - compute on device: cuda:0
+ 2024-03-26 10:24:52,167 - embedding storage: none
+ 2024-03-26 10:24:52,167 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:24:52,167 Model training base path: "flair-co-funer-gbert_base-bs8-e10-lr3e-05-4"
+ 2024-03-26 10:24:52,167 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:24:52,167 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:24:52,167 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2024-03-26 10:24:53,520 epoch 1 - iter 9/95 - loss 3.31156291 - time (sec): 1.35 - samples/sec: 2144.74 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 10:24:54,905 epoch 1 - iter 18/95 - loss 3.18392069 - time (sec): 2.74 - samples/sec: 2014.03 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 10:24:56,522 epoch 1 - iter 27/95 - loss 2.95366883 - time (sec): 4.36 - samples/sec: 1969.00 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 10:24:58,491 epoch 1 - iter 36/95 - loss 2.72938282 - time (sec): 6.32 - samples/sec: 1887.79 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 10:25:00,336 epoch 1 - iter 45/95 - loss 2.51840902 - time (sec): 8.17 - samples/sec: 1908.99 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 10:25:02,554 epoch 1 - iter 54/95 - loss 2.37628283 - time (sec): 10.39 - samples/sec: 1843.30 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 10:25:04,557 epoch 1 - iter 63/95 - loss 2.24029015 - time (sec): 12.39 - samples/sec: 1824.17 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 10:25:05,526 epoch 1 - iter 72/95 - loss 2.15411890 - time (sec): 13.36 - samples/sec: 1869.87 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 10:25:07,799 epoch 1 - iter 81/95 - loss 2.02104777 - time (sec): 15.63 - samples/sec: 1817.14 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 10:25:09,119 epoch 1 - iter 90/95 - loss 1.88999619 - time (sec): 16.95 - samples/sec: 1885.32 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 10:25:10,378 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:25:10,379 EPOCH 1 done: loss 1.8224 - lr: 0.000028
+ 2024-03-26 10:25:11,332 DEV : loss 0.5509695410728455 - f1-score (micro avg) 0.6394
+ 2024-03-26 10:25:11,333 saving best model
+ 2024-03-26 10:25:11,615 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:25:13,192 epoch 2 - iter 9/95 - loss 0.74076124 - time (sec): 1.58 - samples/sec: 1827.24 - lr: 0.000030 - momentum: 0.000000
+ 2024-03-26 10:25:14,827 epoch 2 - iter 18/95 - loss 0.64312417 - time (sec): 3.21 - samples/sec: 1924.49 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 10:25:16,603 epoch 2 - iter 27/95 - loss 0.59053178 - time (sec): 4.99 - samples/sec: 1895.84 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 10:25:18,977 epoch 2 - iter 36/95 - loss 0.52064870 - time (sec): 7.36 - samples/sec: 1772.23 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 10:25:20,940 epoch 2 - iter 45/95 - loss 0.49481256 - time (sec): 9.32 - samples/sec: 1769.45 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 10:25:22,693 epoch 2 - iter 54/95 - loss 0.49494445 - time (sec): 11.08 - samples/sec: 1790.39 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 10:25:25,097 epoch 2 - iter 63/95 - loss 0.46932616 - time (sec): 13.48 - samples/sec: 1773.50 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 10:25:26,915 epoch 2 - iter 72/95 - loss 0.46316340 - time (sec): 15.30 - samples/sec: 1767.50 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 10:25:29,086 epoch 2 - iter 81/95 - loss 0.45228059 - time (sec): 17.47 - samples/sec: 1751.09 - lr: 0.000027 - momentum: 0.000000
+ 2024-03-26 10:25:30,368 epoch 2 - iter 90/95 - loss 0.44481235 - time (sec): 18.75 - samples/sec: 1776.61 - lr: 0.000027 - momentum: 0.000000
+ 2024-03-26 10:25:30,810 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:25:30,810 EPOCH 2 done: loss 0.4387 - lr: 0.000027
+ 2024-03-26 10:25:31,701 DEV : loss 0.27566346526145935 - f1-score (micro avg) 0.8453
+ 2024-03-26 10:25:31,704 saving best model
+ 2024-03-26 10:25:32,186 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:25:33,679 epoch 3 - iter 9/95 - loss 0.28125940 - time (sec): 1.49 - samples/sec: 1776.87 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 10:25:35,396 epoch 3 - iter 18/95 - loss 0.24015894 - time (sec): 3.21 - samples/sec: 1742.05 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 10:25:37,224 epoch 3 - iter 27/95 - loss 0.23445075 - time (sec): 5.04 - samples/sec: 1768.16 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 10:25:38,998 epoch 3 - iter 36/95 - loss 0.23589931 - time (sec): 6.81 - samples/sec: 1772.37 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 10:25:40,991 epoch 3 - iter 45/95 - loss 0.23131598 - time (sec): 8.80 - samples/sec: 1793.43 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 10:25:43,209 epoch 3 - iter 54/95 - loss 0.22372789 - time (sec): 11.02 - samples/sec: 1757.31 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 10:25:44,900 epoch 3 - iter 63/95 - loss 0.21970639 - time (sec): 12.71 - samples/sec: 1759.73 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 10:25:46,852 epoch 3 - iter 72/95 - loss 0.21660214 - time (sec): 14.66 - samples/sec: 1765.29 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 10:25:48,807 epoch 3 - iter 81/95 - loss 0.22201067 - time (sec): 16.62 - samples/sec: 1782.08 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 10:25:51,025 epoch 3 - iter 90/95 - loss 0.21545506 - time (sec): 18.84 - samples/sec: 1760.47 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 10:25:51,637 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:25:51,638 EPOCH 3 done: loss 0.2183 - lr: 0.000024
+ 2024-03-26 10:25:52,544 DEV : loss 0.20986029505729675 - f1-score (micro avg) 0.8793
+ 2024-03-26 10:25:52,546 saving best model
+ 2024-03-26 10:25:52,986 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:25:55,360 epoch 4 - iter 9/95 - loss 0.10059051 - time (sec): 2.37 - samples/sec: 1667.65 - lr: 0.000023 - momentum: 0.000000
+ 2024-03-26 10:25:56,506 epoch 4 - iter 18/95 - loss 0.12160418 - time (sec): 3.52 - samples/sec: 1819.41 - lr: 0.000023 - momentum: 0.000000
+ 2024-03-26 10:25:58,610 epoch 4 - iter 27/95 - loss 0.13599375 - time (sec): 5.62 - samples/sec: 1849.65 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 10:26:00,074 epoch 4 - iter 36/95 - loss 0.13862392 - time (sec): 7.09 - samples/sec: 1882.76 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 10:26:01,369 epoch 4 - iter 45/95 - loss 0.13926567 - time (sec): 8.38 - samples/sec: 1914.79 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 10:26:03,402 epoch 4 - iter 54/95 - loss 0.13469600 - time (sec): 10.41 - samples/sec: 1858.54 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 10:26:05,642 epoch 4 - iter 63/95 - loss 0.14272038 - time (sec): 12.65 - samples/sec: 1829.66 - lr: 0.000021 - momentum: 0.000000
+ 2024-03-26 10:26:07,074 epoch 4 - iter 72/95 - loss 0.14102890 - time (sec): 14.09 - samples/sec: 1865.50 - lr: 0.000021 - momentum: 0.000000
+ 2024-03-26 10:26:08,646 epoch 4 - iter 81/95 - loss 0.13827607 - time (sec): 15.66 - samples/sec: 1896.84 - lr: 0.000021 - momentum: 0.000000
+ 2024-03-26 10:26:10,235 epoch 4 - iter 90/95 - loss 0.13680899 - time (sec): 17.25 - samples/sec: 1925.34 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 10:26:10,854 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:26:10,854 EPOCH 4 done: loss 0.1367 - lr: 0.000020
+ 2024-03-26 10:26:11,748 DEV : loss 0.20099645853042603 - f1-score (micro avg) 0.8814
+ 2024-03-26 10:26:11,749 saving best model
+ 2024-03-26 10:26:12,202 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:26:13,391 epoch 5 - iter 9/95 - loss 0.14854127 - time (sec): 1.19 - samples/sec: 2490.69 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 10:26:14,805 epoch 5 - iter 18/95 - loss 0.13874171 - time (sec): 2.60 - samples/sec: 2240.53 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 10:26:16,759 epoch 5 - iter 27/95 - loss 0.12505013 - time (sec): 4.56 - samples/sec: 2018.87 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 10:26:19,148 epoch 5 - iter 36/95 - loss 0.11891017 - time (sec): 6.94 - samples/sec: 1828.85 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 10:26:20,348 epoch 5 - iter 45/95 - loss 0.12166274 - time (sec): 8.14 - samples/sec: 1875.26 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 10:26:22,184 epoch 5 - iter 54/95 - loss 0.11393605 - time (sec): 9.98 - samples/sec: 1917.94 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 10:26:24,209 epoch 5 - iter 63/95 - loss 0.10482095 - time (sec): 12.01 - samples/sec: 1903.57 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 10:26:25,458 epoch 5 - iter 72/95 - loss 0.10370554 - time (sec): 13.25 - samples/sec: 1931.40 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 10:26:27,975 epoch 5 - iter 81/95 - loss 0.09741425 - time (sec): 15.77 - samples/sec: 1862.26 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 10:26:29,980 epoch 5 - iter 90/95 - loss 0.09710361 - time (sec): 17.78 - samples/sec: 1841.82 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 10:26:30,838 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:26:30,838 EPOCH 5 done: loss 0.0989 - lr: 0.000017
+ 2024-03-26 10:26:31,822 DEV : loss 0.1644967943429947 - f1-score (micro avg) 0.9117
+ 2024-03-26 10:26:31,823 saving best model
+ 2024-03-26 10:26:32,270 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:26:33,920 epoch 6 - iter 9/95 - loss 0.10413120 - time (sec): 1.65 - samples/sec: 2011.85 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 10:26:35,965 epoch 6 - iter 18/95 - loss 0.08089059 - time (sec): 3.69 - samples/sec: 1835.17 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 10:26:37,392 epoch 6 - iter 27/95 - loss 0.08805997 - time (sec): 5.12 - samples/sec: 1866.44 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 10:26:39,706 epoch 6 - iter 36/95 - loss 0.07261456 - time (sec): 7.43 - samples/sec: 1727.43 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 10:26:41,473 epoch 6 - iter 45/95 - loss 0.06884988 - time (sec): 9.20 - samples/sec: 1748.39 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 10:26:43,942 epoch 6 - iter 54/95 - loss 0.07558854 - time (sec): 11.67 - samples/sec: 1725.16 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 10:26:45,446 epoch 6 - iter 63/95 - loss 0.07659241 - time (sec): 13.17 - samples/sec: 1742.49 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 10:26:46,973 epoch 6 - iter 72/95 - loss 0.07788665 - time (sec): 14.70 - samples/sec: 1764.59 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 10:26:49,046 epoch 6 - iter 81/95 - loss 0.07804266 - time (sec): 16.77 - samples/sec: 1757.52 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 10:26:50,216 epoch 6 - iter 90/95 - loss 0.08145631 - time (sec): 17.94 - samples/sec: 1802.53 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 10:26:51,543 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:26:51,543 EPOCH 6 done: loss 0.0800 - lr: 0.000014
+ 2024-03-26 10:26:52,437 DEV : loss 0.16239121556282043 - f1-score (micro avg) 0.9151
+ 2024-03-26 10:26:52,438 saving best model
+ 2024-03-26 10:26:52,911 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:26:54,266 epoch 7 - iter 9/95 - loss 0.05622469 - time (sec): 1.35 - samples/sec: 2343.97 - lr: 0.000013 - momentum: 0.000000
+ 2024-03-26 10:26:56,378 epoch 7 - iter 18/95 - loss 0.05406749 - time (sec): 3.47 - samples/sec: 1945.67 - lr: 0.000013 - momentum: 0.000000
+ 2024-03-26 10:26:58,265 epoch 7 - iter 27/95 - loss 0.06078903 - time (sec): 5.35 - samples/sec: 1826.18 - lr: 0.000013 - momentum: 0.000000
+ 2024-03-26 10:26:59,548 epoch 7 - iter 36/95 - loss 0.05888607 - time (sec): 6.64 - samples/sec: 1885.06 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 10:27:01,230 epoch 7 - iter 45/95 - loss 0.05971791 - time (sec): 8.32 - samples/sec: 1891.63 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 10:27:03,423 epoch 7 - iter 54/95 - loss 0.05716422 - time (sec): 10.51 - samples/sec: 1864.14 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 10:27:05,458 epoch 7 - iter 63/95 - loss 0.05760654 - time (sec): 12.55 - samples/sec: 1814.82 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 10:27:07,579 epoch 7 - iter 72/95 - loss 0.05620365 - time (sec): 14.67 - samples/sec: 1789.21 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 10:27:09,084 epoch 7 - iter 81/95 - loss 0.06064974 - time (sec): 16.17 - samples/sec: 1795.78 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 10:27:10,952 epoch 7 - iter 90/95 - loss 0.06445434 - time (sec): 18.04 - samples/sec: 1823.26 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 10:27:11,633 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:27:11,633 EPOCH 7 done: loss 0.0640 - lr: 0.000010
+ 2024-03-26 10:27:12,537 DEV : loss 0.15409202873706818 - f1-score (micro avg) 0.9208
+ 2024-03-26 10:27:12,538 saving best model
+ 2024-03-26 10:27:12,998 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:27:14,640 epoch 8 - iter 9/95 - loss 0.02608247 - time (sec): 1.64 - samples/sec: 1793.80 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 10:27:16,763 epoch 8 - iter 18/95 - loss 0.03144674 - time (sec): 3.76 - samples/sec: 1761.66 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 10:27:18,610 epoch 8 - iter 27/95 - loss 0.04433786 - time (sec): 5.61 - samples/sec: 1729.74 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 10:27:20,578 epoch 8 - iter 36/95 - loss 0.04263690 - time (sec): 7.58 - samples/sec: 1738.07 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 10:27:21,601 epoch 8 - iter 45/95 - loss 0.04899586 - time (sec): 8.60 - samples/sec: 1823.91 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 10:27:23,528 epoch 8 - iter 54/95 - loss 0.04889158 - time (sec): 10.53 - samples/sec: 1814.37 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 10:27:25,748 epoch 8 - iter 63/95 - loss 0.05325093 - time (sec): 12.75 - samples/sec: 1797.74 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 10:27:27,949 epoch 8 - iter 72/95 - loss 0.05321026 - time (sec): 14.95 - samples/sec: 1788.16 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 10:27:29,646 epoch 8 - iter 81/95 - loss 0.05236218 - time (sec): 16.65 - samples/sec: 1792.92 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 10:27:31,568 epoch 8 - iter 90/95 - loss 0.04940129 - time (sec): 18.57 - samples/sec: 1788.03 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 10:27:32,166 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:27:32,166 EPOCH 8 done: loss 0.0496 - lr: 0.000007
+ 2024-03-26 10:27:33,063 DEV : loss 0.16207414865493774 - f1-score (micro avg) 0.9223
+ 2024-03-26 10:27:33,064 saving best model
+ 2024-03-26 10:27:33,538 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:27:35,061 epoch 9 - iter 9/95 - loss 0.04824836 - time (sec): 1.52 - samples/sec: 2091.15 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 10:27:37,352 epoch 9 - iter 18/95 - loss 0.04164786 - time (sec): 3.81 - samples/sec: 1786.39 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 10:27:38,944 epoch 9 - iter 27/95 - loss 0.03450957 - time (sec): 5.40 - samples/sec: 1803.48 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 10:27:41,246 epoch 9 - iter 36/95 - loss 0.03847090 - time (sec): 7.71 - samples/sec: 1760.67 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 10:27:43,135 epoch 9 - iter 45/95 - loss 0.03682640 - time (sec): 9.59 - samples/sec: 1734.04 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 10:27:44,497 epoch 9 - iter 54/95 - loss 0.04046342 - time (sec): 10.96 - samples/sec: 1780.92 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 10:27:46,572 epoch 9 - iter 63/95 - loss 0.03871736 - time (sec): 13.03 - samples/sec: 1759.33 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 10:27:47,792 epoch 9 - iter 72/95 - loss 0.04374224 - time (sec): 14.25 - samples/sec: 1793.60 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 10:27:50,554 epoch 9 - iter 81/95 - loss 0.04236482 - time (sec): 17.01 - samples/sec: 1745.39 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 10:27:52,199 epoch 9 - iter 90/95 - loss 0.04069339 - time (sec): 18.66 - samples/sec: 1765.59 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 10:27:52,862 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:27:52,862 EPOCH 9 done: loss 0.0423 - lr: 0.000004
+ 2024-03-26 10:27:53,757 DEV : loss 0.1653764247894287 - f1-score (micro avg) 0.9211
+ 2024-03-26 10:27:53,759 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:27:55,594 epoch 10 - iter 9/95 - loss 0.04135826 - time (sec): 1.84 - samples/sec: 1689.72 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 10:27:57,850 epoch 10 - iter 18/95 - loss 0.04412331 - time (sec): 4.09 - samples/sec: 1629.02 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 10:27:59,259 epoch 10 - iter 27/95 - loss 0.03803873 - time (sec): 5.50 - samples/sec: 1783.48 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 10:28:00,977 epoch 10 - iter 36/95 - loss 0.03579299 - time (sec): 7.22 - samples/sec: 1828.24 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 10:28:02,390 epoch 10 - iter 45/95 - loss 0.03456453 - time (sec): 8.63 - samples/sec: 1860.61 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 10:28:03,413 epoch 10 - iter 54/95 - loss 0.03349836 - time (sec): 9.65 - samples/sec: 1932.99 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 10:28:05,218 epoch 10 - iter 63/95 - loss 0.03079665 - time (sec): 11.46 - samples/sec: 1907.53 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 10:28:07,472 epoch 10 - iter 72/95 - loss 0.03522809 - time (sec): 13.71 - samples/sec: 1859.15 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 10:28:09,104 epoch 10 - iter 81/95 - loss 0.03835410 - time (sec): 15.35 - samples/sec: 1851.59 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 10:28:11,405 epoch 10 - iter 90/95 - loss 0.03677696 - time (sec): 17.65 - samples/sec: 1841.74 - lr: 0.000000 - momentum: 0.000000
+ 2024-03-26 10:28:12,644 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:28:12,644 EPOCH 10 done: loss 0.0370 - lr: 0.000000
+ 2024-03-26 10:28:13,542 DEV : loss 0.16609874367713928 - f1-score (micro avg) 0.9295
+ 2024-03-26 10:28:13,543 saving best model
+ 2024-03-26 10:28:14,275 ----------------------------------------------------------------------------------------------------
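The "saving best model" lines follow the dev-set micro F1 named in the "Final evaluation on model from best epoch" entry. Below is a small sketch of that selection rule, using the dev scores copied from the `DEV :` lines; the strict "greater than" comparison is an assumption made here for illustration, not something the log states.

```python
# Dev micro-avg F1 per epoch, copied from the "DEV :" lines of the log.
dev_f1 = [0.6394, 0.8453, 0.8793, 0.8814, 0.9117,
          0.9151, 0.9208, 0.9223, 0.9211, 0.9295]

best_score = float("-inf")
saved_epochs = []  # epochs after which a checkpoint would be written

for epoch, score in enumerate(dev_f1, start=1):
    # Assumed rule: save only when the dev score strictly improves.
    if score > best_score:
        best_score = score
        saved_epochs.append(epoch)

best_epoch = saved_epochs[-1]
print("checkpoint saved after epochs:", saved_epochs)
print("best epoch:", best_epoch, "dev F1:", best_score)
```

Under this rule, epoch 9 is the only epoch that does not trigger a save, which matches the absence of a "saving best model" line after its `DEV :` entry.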
+ 2024-03-26 10:28:14,275 Loading model from best epoch ...
+ 2024-03-26 10:28:15,185 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
+ 2024-03-26 10:28:15,938
+ Results:
+ - F-score (micro) 0.9085
+ - F-score (macro) 0.6896
+ - Accuracy 0.8371
+
+ By class:
+               precision    recall  f1-score   support
+
+  Unternehmen     0.9360    0.8797    0.9070       266
+  Auslagerung     0.8561    0.9076    0.8811       249
+          Ort     0.9632    0.9776    0.9704       134
+     Software     0.0000    0.0000    0.0000         0
+
+    micro avg     0.9064    0.9106    0.9085       649
+    macro avg     0.6888    0.6912    0.6896       649
+ weighted avg     0.9110    0.9106    0.9101       649
+
+ 2024-03-26 10:28:15,938 ----------------------------------------------------------------------------------------------------
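The gap between the micro F-score (0.9085) and the macro F-score (0.6896) in the results above can be reproduced from the per-class rows. The sketch below assumes the usual definitions: macro F1 as the unweighted mean of per-class F1, and micro F1 as the harmonic mean of micro precision and recall. The variable names are illustrative, and the inputs are the rounded values from the table, so results agree only to about four decimals.

```python
# (precision, recall, f1, support) copied from the "By class" table.
per_class = {
    "Unternehmen": (0.9360, 0.8797, 0.9070, 266),
    "Auslagerung": (0.8561, 0.9076, 0.8811, 249),
    "Ort":         (0.9632, 0.9776, 0.9704, 134),
    "Software":    (0.0000, 0.0000, 0.0000, 0),
}

# Macro F1: every class counts equally, including the zero-support class.
f1_scores = [row[2] for row in per_class.values()]
macro_f1 = sum(f1_scores) / len(f1_scores)

# Same average restricted to classes that actually occur in the test set.
seen = [row[2] for row in per_class.values() if row[3] > 0]
macro_f1_seen = sum(seen) / len(seen)

# Micro F1 from the "micro avg" precision and recall.
micro_p, micro_r = 0.9064, 0.9106
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"macro F1 (all 4 classes):      {macro_f1:.4f}")
print(f"macro F1 (classes with support): {macro_f1_seen:.4f}")
print(f"micro F1:                      {micro_f1:.4f}")
```

The `Software` class has no test instances (support 0), so its F1 of 0 lowers the macro average; averaged over the three classes that do occur, the macro F1 would be roughly 0.92.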