stefan-it committed on
Commit
5518485
1 Parent(s): f89053e

Upload ./training.log with huggingface_hub

  1. training.log +246 -0
training.log ADDED
@@ -0,0 +1,246 @@
+ 2024-03-26 15:23:48,594 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:23:48,594 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(31103, 768)
+         (position_embeddings): Embedding(512, 768)
+         (token_type_embeddings): Embedding(2, 768)
+         (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-11): 12 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSelfAttention(
+                 (query): Linear(in_features=768, out_features=768, bias=True)
+                 (key): Linear(in_features=768, out_features=768, bias=True)
+                 (value): Linear(in_features=768, out_features=768, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=768, out_features=768, bias=True)
+                 (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=768, out_features=3072, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=3072, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=768, out_features=768, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=768, out_features=17, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2024-03-26 15:23:48,594 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:23:48,594 Corpus: 758 train + 94 dev + 96 test sentences
+ 2024-03-26 15:23:48,594 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:23:48,594 Train: 758 sentences
+ 2024-03-26 15:23:48,594 (train_with_dev=False, train_with_test=False)
+ 2024-03-26 15:23:48,594 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:23:48,594 Training Params:
+ 2024-03-26 15:23:48,594 - learning_rate: "3e-05"
+ 2024-03-26 15:23:48,594 - mini_batch_size: "8"
+ 2024-03-26 15:23:48,594 - max_epochs: "10"
+ 2024-03-26 15:23:48,594 - shuffle: "True"
+ 2024-03-26 15:23:48,594 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:23:48,594 Plugins:
+ 2024-03-26 15:23:48,594 - TensorboardLogger
+ 2024-03-26 15:23:48,594 - LinearScheduler | warmup_fraction: '0.1'
+ 2024-03-26 15:23:48,594 ----------------------------------------------------------------------------------------------------
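The `lr` values logged per iteration are consistent with a linear schedule that warms up over the first 10% of steps and then decays to zero. The following is a minimal Python sketch under stated assumptions: 95 mini-batches per epoch (758 sentences at batch size 8), 950 steps in total, and a 0-based step counter for the logged value. The helper name `linear_warmup_lr` is hypothetical and is not Flair's API.

```python
import math


def linear_warmup_lr(step, total_steps, peak_lr, warmup_fraction):
    """Learning rate at a 0-based optimizer step (illustrative helper, not Flair code)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr during warmup.
        return peak_lr * step / warmup_steps
    # Linear decay from peak_lr down to 0 over the remaining steps.
    return peak_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


# Values taken from the log header: 758 training sentences, batch size 8, 10 epochs.
steps_per_epoch = math.ceil(758 / 8)
total_steps = steps_per_epoch * 10

# Assumption: the value logged at "iter 9" belongs to 0-based step 8, and so on.
for it in (9, 18, 90):
    print(f"epoch 1 - iter {it}/95 - lr: {linear_warmup_lr(it - 1, total_steps, 3e-5, 0.1):.6f}")
```

Under these assumptions the warmup ends after exactly one epoch (95 steps), which would explain why the logged rate peaks at the start of epoch 2 and reaches zero at the end of epoch 10.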
+ 2024-03-26 15:23:48,595 Final evaluation on model from best epoch (best-model.pt)
+ 2024-03-26 15:23:48,595 - metric: "('micro avg', 'f1-score')"
+ 2024-03-26 15:23:48,595 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:23:48,595 Computation:
+ 2024-03-26 15:23:48,595 - compute on device: cuda:0
+ 2024-03-26 15:23:48,595 - embedding storage: none
+ 2024-03-26 15:23:48,595 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:23:48,595 Model training base path: "flair-co-funer-german_dbmdz_bert_base-bs8-e10-lr3e-05-1"
+ 2024-03-26 15:23:48,595 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:23:48,595 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:23:48,595 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2024-03-26 15:23:50,174 epoch 1 - iter 9/95 - loss 3.14979084 - time (sec): 1.58 - samples/sec: 1949.86 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 15:23:51,703 epoch 1 - iter 18/95 - loss 3.04307356 - time (sec): 3.11 - samples/sec: 2011.38 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 15:23:54,095 epoch 1 - iter 27/95 - loss 2.85325540 - time (sec): 5.50 - samples/sec: 1861.76 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 15:23:56,319 epoch 1 - iter 36/95 - loss 2.65209783 - time (sec): 7.72 - samples/sec: 1809.88 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 15:23:58,209 epoch 1 - iter 45/95 - loss 2.47670116 - time (sec): 9.61 - samples/sec: 1816.34 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 15:23:59,433 epoch 1 - iter 54/95 - loss 2.34499787 - time (sec): 10.84 - samples/sec: 1858.15 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 15:24:01,141 epoch 1 - iter 63/95 - loss 2.21603300 - time (sec): 12.55 - samples/sec: 1854.34 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 15:24:02,426 epoch 1 - iter 72/95 - loss 2.11033026 - time (sec): 13.83 - samples/sec: 1883.35 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 15:24:04,396 epoch 1 - iter 81/95 - loss 1.97356414 - time (sec): 15.80 - samples/sec: 1874.46 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 15:24:05,710 epoch 1 - iter 90/95 - loss 1.87082078 - time (sec): 17.12 - samples/sec: 1895.41 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 15:24:06,924 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:24:06,925 EPOCH 1 done: loss 1.7919 - lr: 0.000028
+ 2024-03-26 15:24:07,760 DEV : loss 0.5088863968849182 - f1-score (micro avg) 0.6482
+ 2024-03-26 15:24:07,761 saving best model
+ 2024-03-26 15:24:08,026 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:24:10,065 epoch 2 - iter 9/95 - loss 0.50823411 - time (sec): 2.04 - samples/sec: 1811.64 - lr: 0.000030 - momentum: 0.000000
+ 2024-03-26 15:24:11,741 epoch 2 - iter 18/95 - loss 0.53694131 - time (sec): 3.71 - samples/sec: 1953.10 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 15:24:13,548 epoch 2 - iter 27/95 - loss 0.50533491 - time (sec): 5.52 - samples/sec: 1867.09 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 15:24:15,312 epoch 2 - iter 36/95 - loss 0.48259154 - time (sec): 7.29 - samples/sec: 1835.30 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 15:24:17,204 epoch 2 - iter 45/95 - loss 0.45474310 - time (sec): 9.18 - samples/sec: 1845.27 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 15:24:19,404 epoch 2 - iter 54/95 - loss 0.42654159 - time (sec): 11.38 - samples/sec: 1814.83 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 15:24:20,715 epoch 2 - iter 63/95 - loss 0.42677167 - time (sec): 12.69 - samples/sec: 1856.69 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 15:24:22,043 epoch 2 - iter 72/95 - loss 0.41373855 - time (sec): 14.02 - samples/sec: 1887.33 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 15:24:23,845 epoch 2 - iter 81/95 - loss 0.40294984 - time (sec): 15.82 - samples/sec: 1871.34 - lr: 0.000027 - momentum: 0.000000
+ 2024-03-26 15:24:25,501 epoch 2 - iter 90/95 - loss 0.39552323 - time (sec): 17.47 - samples/sec: 1867.73 - lr: 0.000027 - momentum: 0.000000
+ 2024-03-26 15:24:26,434 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:24:26,434 EPOCH 2 done: loss 0.3899 - lr: 0.000027
+ 2024-03-26 15:24:27,345 DEV : loss 0.2915351390838623 - f1-score (micro avg) 0.8051
+ 2024-03-26 15:24:27,347 saving best model
+ 2024-03-26 15:24:27,809 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:24:29,750 epoch 3 - iter 9/95 - loss 0.33132299 - time (sec): 1.94 - samples/sec: 1730.47 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 15:24:31,672 epoch 3 - iter 18/95 - loss 0.27912755 - time (sec): 3.86 - samples/sec: 1742.68 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 15:24:33,024 epoch 3 - iter 27/95 - loss 0.26573965 - time (sec): 5.21 - samples/sec: 1835.11 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 15:24:35,491 epoch 3 - iter 36/95 - loss 0.25234694 - time (sec): 7.68 - samples/sec: 1760.44 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 15:24:37,721 epoch 3 - iter 45/95 - loss 0.24175513 - time (sec): 9.91 - samples/sec: 1791.44 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 15:24:38,903 epoch 3 - iter 54/95 - loss 0.23630673 - time (sec): 11.09 - samples/sec: 1847.22 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 15:24:40,821 epoch 3 - iter 63/95 - loss 0.22822548 - time (sec): 13.01 - samples/sec: 1830.87 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 15:24:42,439 epoch 3 - iter 72/95 - loss 0.21717616 - time (sec): 14.63 - samples/sec: 1836.05 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 15:24:44,180 epoch 3 - iter 81/95 - loss 0.21748434 - time (sec): 16.37 - samples/sec: 1827.41 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 15:24:46,347 epoch 3 - iter 90/95 - loss 0.20897001 - time (sec): 18.54 - samples/sec: 1797.14 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 15:24:46,823 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:24:46,823 EPOCH 3 done: loss 0.2085 - lr: 0.000024
+ 2024-03-26 15:24:47,721 DEV : loss 0.2427646666765213 - f1-score (micro avg) 0.8686
+ 2024-03-26 15:24:47,722 saving best model
+ 2024-03-26 15:24:48,166 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:24:49,762 epoch 4 - iter 9/95 - loss 0.17196514 - time (sec): 1.60 - samples/sec: 2018.98 - lr: 0.000023 - momentum: 0.000000
+ 2024-03-26 15:24:51,788 epoch 4 - iter 18/95 - loss 0.14921918 - time (sec): 3.62 - samples/sec: 1780.76 - lr: 0.000023 - momentum: 0.000000
+ 2024-03-26 15:24:53,569 epoch 4 - iter 27/95 - loss 0.15312969 - time (sec): 5.40 - samples/sec: 1803.35 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 15:24:56,120 epoch 4 - iter 36/95 - loss 0.13017501 - time (sec): 7.95 - samples/sec: 1732.26 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 15:24:57,807 epoch 4 - iter 45/95 - loss 0.13818847 - time (sec): 9.64 - samples/sec: 1751.27 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 15:24:59,342 epoch 4 - iter 54/95 - loss 0.13889360 - time (sec): 11.18 - samples/sec: 1804.85 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 15:25:01,194 epoch 4 - iter 63/95 - loss 0.14186955 - time (sec): 13.03 - samples/sec: 1827.36 - lr: 0.000021 - momentum: 0.000000
+ 2024-03-26 15:25:02,476 epoch 4 - iter 72/95 - loss 0.14233703 - time (sec): 14.31 - samples/sec: 1856.74 - lr: 0.000021 - momentum: 0.000000
+ 2024-03-26 15:25:04,195 epoch 4 - iter 81/95 - loss 0.14163043 - time (sec): 16.03 - samples/sec: 1846.28 - lr: 0.000021 - momentum: 0.000000
+ 2024-03-26 15:25:05,685 epoch 4 - iter 90/95 - loss 0.14066665 - time (sec): 17.52 - samples/sec: 1867.47 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 15:25:06,586 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:25:06,586 EPOCH 4 done: loss 0.1405 - lr: 0.000020
+ 2024-03-26 15:25:07,483 DEV : loss 0.19904547929763794 - f1-score (micro avg) 0.8939
+ 2024-03-26 15:25:07,484 saving best model
+ 2024-03-26 15:25:07,946 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:25:09,696 epoch 5 - iter 9/95 - loss 0.10174251 - time (sec): 1.75 - samples/sec: 1809.09 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 15:25:11,832 epoch 5 - iter 18/95 - loss 0.10059690 - time (sec): 3.89 - samples/sec: 1725.18 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 15:25:13,395 epoch 5 - iter 27/95 - loss 0.09581650 - time (sec): 5.45 - samples/sec: 1780.52 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 15:25:15,068 epoch 5 - iter 36/95 - loss 0.09741828 - time (sec): 7.12 - samples/sec: 1771.45 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 15:25:16,736 epoch 5 - iter 45/95 - loss 0.11306233 - time (sec): 8.79 - samples/sec: 1825.26 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 15:25:18,335 epoch 5 - iter 54/95 - loss 0.11499609 - time (sec): 10.39 - samples/sec: 1872.32 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 15:25:20,176 epoch 5 - iter 63/95 - loss 0.11069119 - time (sec): 12.23 - samples/sec: 1852.58 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 15:25:22,399 epoch 5 - iter 72/95 - loss 0.10254022 - time (sec): 14.45 - samples/sec: 1877.67 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 15:25:23,644 epoch 5 - iter 81/95 - loss 0.10177281 - time (sec): 15.70 - samples/sec: 1897.07 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 15:25:25,787 epoch 5 - iter 90/95 - loss 0.09762038 - time (sec): 17.84 - samples/sec: 1856.20 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 15:25:26,417 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:25:26,417 EPOCH 5 done: loss 0.0982 - lr: 0.000017
+ 2024-03-26 15:25:27,324 DEV : loss 0.18307699263095856 - f1-score (micro avg) 0.9073
+ 2024-03-26 15:25:27,325 saving best model
+ 2024-03-26 15:25:27,804 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:25:29,385 epoch 6 - iter 9/95 - loss 0.03895202 - time (sec): 1.58 - samples/sec: 1829.62 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 15:25:31,383 epoch 6 - iter 18/95 - loss 0.05992809 - time (sec): 3.58 - samples/sec: 1833.81 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 15:25:33,061 epoch 6 - iter 27/95 - loss 0.06934296 - time (sec): 5.26 - samples/sec: 1870.23 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 15:25:34,708 epoch 6 - iter 36/95 - loss 0.06556657 - time (sec): 6.90 - samples/sec: 1835.76 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 15:25:36,306 epoch 6 - iter 45/95 - loss 0.06874063 - time (sec): 8.50 - samples/sec: 1849.94 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 15:25:38,304 epoch 6 - iter 54/95 - loss 0.07341583 - time (sec): 10.50 - samples/sec: 1831.20 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 15:25:39,875 epoch 6 - iter 63/95 - loss 0.07526510 - time (sec): 12.07 - samples/sec: 1831.58 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 15:25:42,666 epoch 6 - iter 72/95 - loss 0.07039873 - time (sec): 14.86 - samples/sec: 1794.44 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 15:25:44,514 epoch 6 - iter 81/95 - loss 0.07000601 - time (sec): 16.71 - samples/sec: 1802.30 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 15:25:46,183 epoch 6 - iter 90/95 - loss 0.07083690 - time (sec): 18.38 - samples/sec: 1796.30 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 15:25:46,793 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:25:46,793 EPOCH 6 done: loss 0.0725 - lr: 0.000014
+ 2024-03-26 15:25:47,703 DEV : loss 0.18871811032295227 - f1-score (micro avg) 0.9027
+ 2024-03-26 15:25:47,704 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:25:49,029 epoch 7 - iter 9/95 - loss 0.08815916 - time (sec): 1.32 - samples/sec: 2233.22 - lr: 0.000013 - momentum: 0.000000
+ 2024-03-26 15:25:50,646 epoch 7 - iter 18/95 - loss 0.07169953 - time (sec): 2.94 - samples/sec: 1996.51 - lr: 0.000013 - momentum: 0.000000
+ 2024-03-26 15:25:52,445 epoch 7 - iter 27/95 - loss 0.07522697 - time (sec): 4.74 - samples/sec: 1928.44 - lr: 0.000013 - momentum: 0.000000
+ 2024-03-26 15:25:54,312 epoch 7 - iter 36/95 - loss 0.06714080 - time (sec): 6.61 - samples/sec: 1893.53 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 15:25:56,599 epoch 7 - iter 45/95 - loss 0.06055062 - time (sec): 8.89 - samples/sec: 1842.49 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 15:25:57,578 epoch 7 - iter 54/95 - loss 0.06119999 - time (sec): 9.87 - samples/sec: 1918.84 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 15:25:59,426 epoch 7 - iter 63/95 - loss 0.05722868 - time (sec): 11.72 - samples/sec: 1919.19 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 15:26:01,331 epoch 7 - iter 72/95 - loss 0.05507776 - time (sec): 13.63 - samples/sec: 1879.65 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 15:26:03,267 epoch 7 - iter 81/95 - loss 0.05448703 - time (sec): 15.56 - samples/sec: 1876.13 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 15:26:05,199 epoch 7 - iter 90/95 - loss 0.05399226 - time (sec): 17.49 - samples/sec: 1879.34 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 15:26:06,024 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:26:06,025 EPOCH 7 done: loss 0.0539 - lr: 0.000010
+ 2024-03-26 15:26:06,957 DEV : loss 0.18794356286525726 - f1-score (micro avg) 0.9148
+ 2024-03-26 15:26:06,958 saving best model
+ 2024-03-26 15:26:07,419 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:26:09,010 epoch 8 - iter 9/95 - loss 0.05739097 - time (sec): 1.59 - samples/sec: 1880.34 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 15:26:11,015 epoch 8 - iter 18/95 - loss 0.05214964 - time (sec): 3.60 - samples/sec: 1691.16 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 15:26:12,574 epoch 8 - iter 27/95 - loss 0.05666378 - time (sec): 5.15 - samples/sec: 1785.67 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 15:26:14,284 epoch 8 - iter 36/95 - loss 0.05500534 - time (sec): 6.87 - samples/sec: 1833.93 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 15:26:16,576 epoch 8 - iter 45/95 - loss 0.04735151 - time (sec): 9.16 - samples/sec: 1815.50 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 15:26:18,867 epoch 8 - iter 54/95 - loss 0.04751933 - time (sec): 11.45 - samples/sec: 1819.00 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 15:26:20,811 epoch 8 - iter 63/95 - loss 0.04909725 - time (sec): 13.39 - samples/sec: 1823.14 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 15:26:21,890 epoch 8 - iter 72/95 - loss 0.04820220 - time (sec): 14.47 - samples/sec: 1855.66 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 15:26:23,542 epoch 8 - iter 81/95 - loss 0.04660140 - time (sec): 16.12 - samples/sec: 1840.90 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 15:26:24,903 epoch 8 - iter 90/95 - loss 0.04558973 - time (sec): 17.48 - samples/sec: 1856.55 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 15:26:26,112 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:26:26,112 EPOCH 8 done: loss 0.0478 - lr: 0.000007
+ 2024-03-26 15:26:27,022 DEV : loss 0.1870705783367157 - f1-score (micro avg) 0.924
+ 2024-03-26 15:26:27,025 saving best model
+ 2024-03-26 15:26:27,485 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:26:29,237 epoch 9 - iter 9/95 - loss 0.02106672 - time (sec): 1.75 - samples/sec: 1983.41 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 15:26:31,152 epoch 9 - iter 18/95 - loss 0.02225839 - time (sec): 3.67 - samples/sec: 1843.01 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 15:26:32,991 epoch 9 - iter 27/95 - loss 0.02596204 - time (sec): 5.51 - samples/sec: 1784.61 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 15:26:34,917 epoch 9 - iter 36/95 - loss 0.03410280 - time (sec): 7.43 - samples/sec: 1811.69 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 15:26:36,789 epoch 9 - iter 45/95 - loss 0.03413443 - time (sec): 9.30 - samples/sec: 1792.29 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 15:26:38,645 epoch 9 - iter 54/95 - loss 0.03408301 - time (sec): 11.16 - samples/sec: 1822.88 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 15:26:40,516 epoch 9 - iter 63/95 - loss 0.03420543 - time (sec): 13.03 - samples/sec: 1822.30 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 15:26:42,088 epoch 9 - iter 72/95 - loss 0.03587913 - time (sec): 14.60 - samples/sec: 1833.40 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 15:26:43,788 epoch 9 - iter 81/95 - loss 0.03734644 - time (sec): 16.30 - samples/sec: 1824.04 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 15:26:45,538 epoch 9 - iter 90/95 - loss 0.03562347 - time (sec): 18.05 - samples/sec: 1841.38 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 15:26:46,038 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:26:46,038 EPOCH 9 done: loss 0.0360 - lr: 0.000004
+ 2024-03-26 15:26:46,937 DEV : loss 0.194667786359787 - f1-score (micro avg) 0.9249
+ 2024-03-26 15:26:46,938 saving best model
+ 2024-03-26 15:26:47,393 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:26:48,863 epoch 10 - iter 9/95 - loss 0.01764166 - time (sec): 1.47 - samples/sec: 1891.64 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 15:26:50,677 epoch 10 - iter 18/95 - loss 0.02086608 - time (sec): 3.28 - samples/sec: 1841.60 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 15:26:52,806 epoch 10 - iter 27/95 - loss 0.02960285 - time (sec): 5.41 - samples/sec: 1786.52 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 15:26:54,657 epoch 10 - iter 36/95 - loss 0.03318762 - time (sec): 7.26 - samples/sec: 1806.19 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 15:26:55,822 epoch 10 - iter 45/95 - loss 0.03215798 - time (sec): 8.43 - samples/sec: 1860.01 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 15:26:57,715 epoch 10 - iter 54/95 - loss 0.03308007 - time (sec): 10.32 - samples/sec: 1844.85 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 15:26:59,093 epoch 10 - iter 63/95 - loss 0.03432878 - time (sec): 11.70 - samples/sec: 1857.55 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 15:27:01,335 epoch 10 - iter 72/95 - loss 0.02991145 - time (sec): 13.94 - samples/sec: 1837.36 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 15:27:03,633 epoch 10 - iter 81/95 - loss 0.03419502 - time (sec): 16.24 - samples/sec: 1818.46 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 15:27:05,471 epoch 10 - iter 90/95 - loss 0.03200818 - time (sec): 18.08 - samples/sec: 1810.90 - lr: 0.000000 - momentum: 0.000000
+ 2024-03-26 15:27:06,480 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:27:06,480 EPOCH 10 done: loss 0.0310 - lr: 0.000000
+ 2024-03-26 15:27:07,403 DEV : loss 0.1904587596654892 - f1-score (micro avg) 0.9336
+ 2024-03-26 15:27:07,404 saving best model
+ 2024-03-26 15:27:08,189 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 15:27:08,189 Loading model from best epoch ...
+ 2024-03-26 15:27:09,080 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
+ 2024-03-26 15:27:09,836
+ Results:
+ - F-score (micro) 0.9121
+ - F-score (macro) 0.6924
+ - Accuracy 0.8408
+
+ By class:
+               precision    recall  f1-score   support
+
+  Unternehmen     0.9147    0.8872    0.9008       266
+  Auslagerung     0.8707    0.9197    0.8945       249
+          Ort     0.9635    0.9851    0.9742       134
+     Software     0.0000    0.0000    0.0000         0
+
+    micro avg     0.9045    0.9199    0.9121       649
+    macro avg     0.6872    0.6980    0.6924       649
+ weighted avg     0.9079    0.9199    0.9135       649
+
+ 2024-03-26 15:27:09,836 ----------------------------------------------------------------------------------------------------
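
The gap between the micro (0.9121) and macro (0.6924) F-scores in the results above appears to come from the `Software` class, which has zero support in the test split and therefore contributes an F1 of 0 to the unweighted mean. The following Python sketch recomputes the averages from the rounded values in the report; it is an illustration of the arithmetic only, not the evaluation code that produced the log, and small differences are expected because the inputs are rounded to four decimals.

```python
# Per-class (f1-score, support) copied from the "By class" report.
report = {
    "Unternehmen": (0.9008, 266),
    "Auslagerung": (0.8945, 249),
    "Ort": (0.9742, 134),
    "Software": (0.0000, 0),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall, defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Macro average: unweighted mean over all four classes, including the empty one.
macro_f1 = sum(score for score, _ in report.values()) / len(report)

# Same mean restricted to classes that actually occur in the test split.
supported = [score for score, support in report.values() if support > 0]
macro_f1_supported = sum(supported) / len(supported)

# Weighted average: classes weighted by support, so "Software" has no effect.
total_support = sum(support for _, support in report.values())
weighted_f1 = sum(score * support for score, support in report.values()) / total_support

# Micro average: F1 of the pooled precision and recall from the "micro avg" row.
micro_f1 = f1(0.9045, 0.9199)

print(f"macro: {macro_f1:.4f}  macro (supported classes only): {macro_f1_supported:.4f}")
print(f"weighted: {weighted_f1:.4f}  micro: {micro_f1:.4f}")
```

Averaged over only the three classes that occur in the test data, the macro F1 would be roughly 0.92, in line with the micro and weighted figures.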