stefan-it committed
Commit 35dd799
1 Parent(s): dad30b9

Upload ./training.log with huggingface_hub

Files changed (1)
  1. training.log +245 -0
training.log ADDED
@@ -0,0 +1,245 @@
+ 2024-03-26 10:12:07,550 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:12:07,551 Model: "SequenceTagger(
+ (embeddings): TransformerWordEmbeddings(
+ (model): BertModel(
+ (embeddings): BertEmbeddings(
+ (word_embeddings): Embedding(31103, 768)
+ (position_embeddings): Embedding(512, 768)
+ (token_type_embeddings): Embedding(2, 768)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (encoder): BertEncoder(
+ (layer): ModuleList(
+ (0-11): 12 x BertLayer(
+ (attention): BertAttention(
+ (self): BertSelfAttention(
+ (query): Linear(in_features=768, out_features=768, bias=True)
+ (key): Linear(in_features=768, out_features=768, bias=True)
+ (value): Linear(in_features=768, out_features=768, bias=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (output): BertSelfOutput(
+ (dense): Linear(in_features=768, out_features=768, bias=True)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ (intermediate): BertIntermediate(
+ (dense): Linear(in_features=768, out_features=3072, bias=True)
+ (intermediate_act_fn): GELUActivation()
+ )
+ (output): BertOutput(
+ (dense): Linear(in_features=3072, out_features=768, bias=True)
+ (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ )
+ )
+ (pooler): BertPooler(
+ (dense): Linear(in_features=768, out_features=768, bias=True)
+ (activation): Tanh()
+ )
+ )
+ )
+ (locked_dropout): LockedDropout(p=0.5)
+ (linear): Linear(in_features=768, out_features=17, bias=True)
+ (loss_function): CrossEntropyLoss()
+ )"
+ 2024-03-26 10:12:07,551 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:12:07,551 Corpus: 758 train + 94 dev + 96 test sentences
+ 2024-03-26 10:12:07,551 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:12:07,551 Train: 758 sentences
+ 2024-03-26 10:12:07,551 (train_with_dev=False, train_with_test=False)
+ 2024-03-26 10:12:07,551 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:12:07,551 Training Params:
+ 2024-03-26 10:12:07,551 - learning_rate: "5e-05"
+ 2024-03-26 10:12:07,551 - mini_batch_size: "8"
+ 2024-03-26 10:12:07,551 - max_epochs: "10"
+ 2024-03-26 10:12:07,551 - shuffle: "True"
+ 2024-03-26 10:12:07,551 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:12:07,551 Plugins:
+ 2024-03-26 10:12:07,551 - TensorboardLogger
+ 2024-03-26 10:12:07,551 - LinearScheduler | warmup_fraction: '0.1'
+ 2024-03-26 10:12:07,551 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:12:07,551 Final evaluation on model from best epoch (best-model.pt)
+ 2024-03-26 10:12:07,551 - metric: "('micro avg', 'f1-score')"
+ 2024-03-26 10:12:07,551 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:12:07,551 Computation:
+ 2024-03-26 10:12:07,551 - compute on device: cuda:0
+ 2024-03-26 10:12:07,551 - embedding storage: none
+ 2024-03-26 10:12:07,551 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:12:07,551 Model training base path: "flair-co-funer-gbert_base-bs8-e10-lr5e-05-3"
+ 2024-03-26 10:12:07,551 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:12:07,551 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:12:07,551 Logging anything other than scalars to TensorBoard is currently not supported.
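The setup above (fine-tuning gbert_base embeddings with a plain linear tag head, learning rate 5e-05, mini-batch size 8, 10 epochs, linear warmup fraction 0.1) matches Flair's standard fine_tune recipe. A minimal sketch of how such a run could be reproduced is given below; the data folder, column format and the deepset/gbert-base checkpoint are assumptions inferred from the base path, not stated in the log.

    # Sketch only: reproduce the configuration reported above with Flair.
    # Assumed: CoNLL-style column files under "data/co_funer" and the
    # "deepset/gbert-base" checkpoint behind the "gbert_base" base path.
    from flair.datasets import ColumnCorpus
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    corpus = ColumnCorpus("data/co_funer", column_format={0: "text", 1: "ner"})
    label_dict = corpus.make_label_dictionary(label_type="ner")

    embeddings = TransformerWordEmbeddings("deepset/gbert-base", fine_tune=True)

    tagger = SequenceTagger(
        hidden_size=256,          # unused without an RNN, but required by the constructor
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_rnn=False,            # log shows only locked dropout + linear head
        use_crf=False,            # loss_function is plain CrossEntropyLoss
        reproject_embeddings=False,
    )

    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "flair-co-funer-gbert_base-bs8-e10-lr5e-05-3",
        learning_rate=5e-5,
        mini_batch_size=8,
        max_epochs=10,            # a linear scheduler with warmup_fraction 0.1 is the fine_tune default
    )
    # The original run also attached Flair's TensorboardLogger plugin (omitted here).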
+ 2024-03-26 10:12:08,913 epoch 1 - iter 9/95 - loss 3.30878654 - time (sec): 1.36 - samples/sec: 2342.42 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 10:12:10,727 epoch 1 - iter 18/95 - loss 3.10425395 - time (sec): 3.18 - samples/sec: 1988.82 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 10:12:12,640 epoch 1 - iter 27/95 - loss 2.80107707 - time (sec): 5.09 - samples/sec: 1941.49 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 10:12:14,009 epoch 1 - iter 36/95 - loss 2.56832707 - time (sec): 6.46 - samples/sec: 1959.16 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 10:12:15,906 epoch 1 - iter 45/95 - loss 2.38441941 - time (sec): 8.35 - samples/sec: 1941.76 - lr: 0.000023 - momentum: 0.000000
+ 2024-03-26 10:12:17,254 epoch 1 - iter 54/95 - loss 2.23503114 - time (sec): 9.70 - samples/sec: 1967.22 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 10:12:18,501 epoch 1 - iter 63/95 - loss 2.10127196 - time (sec): 10.95 - samples/sec: 1994.22 - lr: 0.000033 - momentum: 0.000000
+ 2024-03-26 10:12:20,434 epoch 1 - iter 72/95 - loss 1.93262394 - time (sec): 12.88 - samples/sec: 1980.87 - lr: 0.000037 - momentum: 0.000000
+ 2024-03-26 10:12:22,397 epoch 1 - iter 81/95 - loss 1.77367380 - time (sec): 14.85 - samples/sec: 1966.51 - lr: 0.000042 - momentum: 0.000000
+ 2024-03-26 10:12:23,920 epoch 1 - iter 90/95 - loss 1.66263281 - time (sec): 16.37 - samples/sec: 1984.51 - lr: 0.000047 - momentum: 0.000000
+ 2024-03-26 10:12:24,965 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:12:24,965 EPOCH 1 done: loss 1.5923 - lr: 0.000047
+ 2024-03-26 10:12:25,850 DEV : loss 0.4564521014690399 - f1-score (micro avg) 0.6914
+ 2024-03-26 10:12:25,851 saving best model
+ 2024-03-26 10:12:26,114 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:12:27,479 epoch 2 - iter 9/95 - loss 0.52426643 - time (sec): 1.36 - samples/sec: 2007.72 - lr: 0.000050 - momentum: 0.000000
+ 2024-03-26 10:12:29,308 epoch 2 - iter 18/95 - loss 0.41660600 - time (sec): 3.19 - samples/sec: 1913.27 - lr: 0.000049 - momentum: 0.000000
+ 2024-03-26 10:12:30,481 epoch 2 - iter 27/95 - loss 0.40217596 - time (sec): 4.37 - samples/sec: 1965.47 - lr: 0.000048 - momentum: 0.000000
+ 2024-03-26 10:12:32,723 epoch 2 - iter 36/95 - loss 0.38722517 - time (sec): 6.61 - samples/sec: 1918.56 - lr: 0.000048 - momentum: 0.000000
+ 2024-03-26 10:12:34,656 epoch 2 - iter 45/95 - loss 0.37676757 - time (sec): 8.54 - samples/sec: 1925.24 - lr: 0.000047 - momentum: 0.000000
+ 2024-03-26 10:12:36,809 epoch 2 - iter 54/95 - loss 0.36567606 - time (sec): 10.69 - samples/sec: 1897.06 - lr: 0.000047 - momentum: 0.000000
+ 2024-03-26 10:12:38,805 epoch 2 - iter 63/95 - loss 0.35177076 - time (sec): 12.69 - samples/sec: 1852.39 - lr: 0.000046 - momentum: 0.000000
+ 2024-03-26 10:12:40,311 epoch 2 - iter 72/95 - loss 0.35231443 - time (sec): 14.20 - samples/sec: 1860.87 - lr: 0.000046 - momentum: 0.000000
+ 2024-03-26 10:12:41,751 epoch 2 - iter 81/95 - loss 0.35619563 - time (sec): 15.64 - samples/sec: 1883.95 - lr: 0.000045 - momentum: 0.000000
+ 2024-03-26 10:12:43,960 epoch 2 - iter 90/95 - loss 0.34268426 - time (sec): 17.85 - samples/sec: 1858.76 - lr: 0.000045 - momentum: 0.000000
+ 2024-03-26 10:12:44,600 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:12:44,600 EPOCH 2 done: loss 0.3389 - lr: 0.000045
+ 2024-03-26 10:12:45,487 DEV : loss 0.2517702877521515 - f1-score (micro avg) 0.8598
+ 2024-03-26 10:12:45,488 saving best model
+ 2024-03-26 10:12:45,916 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:12:47,544 epoch 3 - iter 9/95 - loss 0.18626350 - time (sec): 1.63 - samples/sec: 1836.16 - lr: 0.000044 - momentum: 0.000000
+ 2024-03-26 10:12:49,324 epoch 3 - iter 18/95 - loss 0.16963537 - time (sec): 3.41 - samples/sec: 1858.50 - lr: 0.000043 - momentum: 0.000000
+ 2024-03-26 10:12:50,515 epoch 3 - iter 27/95 - loss 0.18166954 - time (sec): 4.60 - samples/sec: 2032.19 - lr: 0.000043 - momentum: 0.000000
+ 2024-03-26 10:12:52,065 epoch 3 - iter 36/95 - loss 0.17639533 - time (sec): 6.15 - samples/sec: 2020.76 - lr: 0.000042 - momentum: 0.000000
+ 2024-03-26 10:12:53,470 epoch 3 - iter 45/95 - loss 0.17991451 - time (sec): 7.55 - samples/sec: 2029.39 - lr: 0.000042 - momentum: 0.000000
+ 2024-03-26 10:12:55,457 epoch 3 - iter 54/95 - loss 0.17836259 - time (sec): 9.54 - samples/sec: 1981.19 - lr: 0.000041 - momentum: 0.000000
+ 2024-03-26 10:12:57,451 epoch 3 - iter 63/95 - loss 0.17801210 - time (sec): 11.53 - samples/sec: 1930.20 - lr: 0.000041 - momentum: 0.000000
+ 2024-03-26 10:12:59,290 epoch 3 - iter 72/95 - loss 0.17908359 - time (sec): 13.37 - samples/sec: 1912.74 - lr: 0.000040 - momentum: 0.000000
+ 2024-03-26 10:13:01,293 epoch 3 - iter 81/95 - loss 0.17473687 - time (sec): 15.38 - samples/sec: 1885.59 - lr: 0.000040 - momentum: 0.000000
+ 2024-03-26 10:13:03,243 epoch 3 - iter 90/95 - loss 0.18331440 - time (sec): 17.33 - samples/sec: 1887.20 - lr: 0.000039 - momentum: 0.000000
+ 2024-03-26 10:13:04,335 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:13:04,336 EPOCH 3 done: loss 0.1780 - lr: 0.000039
+ 2024-03-26 10:13:05,235 DEV : loss 0.21343179047107697 - f1-score (micro avg) 0.8682
+ 2024-03-26 10:13:05,236 saving best model
+ 2024-03-26 10:13:05,669 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:13:06,950 epoch 4 - iter 9/95 - loss 0.12934404 - time (sec): 1.28 - samples/sec: 2171.15 - lr: 0.000039 - momentum: 0.000000
+ 2024-03-26 10:13:08,782 epoch 4 - iter 18/95 - loss 0.11598588 - time (sec): 3.11 - samples/sec: 1975.63 - lr: 0.000038 - momentum: 0.000000
+ 2024-03-26 10:13:10,695 epoch 4 - iter 27/95 - loss 0.11184561 - time (sec): 5.02 - samples/sec: 1922.31 - lr: 0.000037 - momentum: 0.000000
+ 2024-03-26 10:13:12,169 epoch 4 - iter 36/95 - loss 0.10779297 - time (sec): 6.50 - samples/sec: 1930.71 - lr: 0.000037 - momentum: 0.000000
+ 2024-03-26 10:13:14,583 epoch 4 - iter 45/95 - loss 0.10822640 - time (sec): 8.91 - samples/sec: 1854.16 - lr: 0.000036 - momentum: 0.000000
+ 2024-03-26 10:13:16,423 epoch 4 - iter 54/95 - loss 0.10627905 - time (sec): 10.75 - samples/sec: 1835.61 - lr: 0.000036 - momentum: 0.000000
+ 2024-03-26 10:13:18,347 epoch 4 - iter 63/95 - loss 0.10563130 - time (sec): 12.68 - samples/sec: 1814.36 - lr: 0.000035 - momentum: 0.000000
+ 2024-03-26 10:13:20,216 epoch 4 - iter 72/95 - loss 0.11061803 - time (sec): 14.54 - samples/sec: 1829.28 - lr: 0.000035 - momentum: 0.000000
+ 2024-03-26 10:13:22,230 epoch 4 - iter 81/95 - loss 0.11629637 - time (sec): 16.56 - samples/sec: 1825.85 - lr: 0.000034 - momentum: 0.000000
+ 2024-03-26 10:13:23,199 epoch 4 - iter 90/95 - loss 0.11624132 - time (sec): 17.53 - samples/sec: 1865.91 - lr: 0.000034 - momentum: 0.000000
+ 2024-03-26 10:13:24,226 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:13:24,226 EPOCH 4 done: loss 0.1162 - lr: 0.000034
+ 2024-03-26 10:13:25,126 DEV : loss 0.17539489269256592 - f1-score (micro avg) 0.9069
+ 2024-03-26 10:13:25,128 saving best model
+ 2024-03-26 10:13:25,559 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:13:27,443 epoch 5 - iter 9/95 - loss 0.07744264 - time (sec): 1.88 - samples/sec: 1825.68 - lr: 0.000033 - momentum: 0.000000
+ 2024-03-26 10:13:28,876 epoch 5 - iter 18/95 - loss 0.07305274 - time (sec): 3.32 - samples/sec: 1887.57 - lr: 0.000032 - momentum: 0.000000
+ 2024-03-26 10:13:30,235 epoch 5 - iter 27/95 - loss 0.08372766 - time (sec): 4.67 - samples/sec: 1931.87 - lr: 0.000032 - momentum: 0.000000
+ 2024-03-26 10:13:32,114 epoch 5 - iter 36/95 - loss 0.08762107 - time (sec): 6.55 - samples/sec: 1872.88 - lr: 0.000031 - momentum: 0.000000
+ 2024-03-26 10:13:34,321 epoch 5 - iter 45/95 - loss 0.08538004 - time (sec): 8.76 - samples/sec: 1858.35 - lr: 0.000031 - momentum: 0.000000
+ 2024-03-26 10:13:36,754 epoch 5 - iter 54/95 - loss 0.08449088 - time (sec): 11.19 - samples/sec: 1814.50 - lr: 0.000030 - momentum: 0.000000
+ 2024-03-26 10:13:38,420 epoch 5 - iter 63/95 - loss 0.08194794 - time (sec): 12.86 - samples/sec: 1806.64 - lr: 0.000030 - momentum: 0.000000
+ 2024-03-26 10:13:40,188 epoch 5 - iter 72/95 - loss 0.08370354 - time (sec): 14.63 - samples/sec: 1808.03 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 10:13:42,386 epoch 5 - iter 81/95 - loss 0.08624660 - time (sec): 16.83 - samples/sec: 1793.37 - lr: 0.000029 - momentum: 0.000000
+ 2024-03-26 10:13:43,778 epoch 5 - iter 90/95 - loss 0.08725992 - time (sec): 18.22 - samples/sec: 1809.23 - lr: 0.000028 - momentum: 0.000000
+ 2024-03-26 10:13:44,549 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:13:44,549 EPOCH 5 done: loss 0.0853 - lr: 0.000028
+ 2024-03-26 10:13:45,528 DEV : loss 0.15603235363960266 - f1-score (micro avg) 0.9191
+ 2024-03-26 10:13:45,529 saving best model
+ 2024-03-26 10:13:45,955 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:13:47,889 epoch 6 - iter 9/95 - loss 0.05189760 - time (sec): 1.93 - samples/sec: 1805.09 - lr: 0.000027 - momentum: 0.000000
+ 2024-03-26 10:13:49,440 epoch 6 - iter 18/95 - loss 0.05290575 - time (sec): 3.48 - samples/sec: 1826.05 - lr: 0.000027 - momentum: 0.000000
+ 2024-03-26 10:13:51,348 epoch 6 - iter 27/95 - loss 0.05365903 - time (sec): 5.39 - samples/sec: 1834.17 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 10:13:52,910 epoch 6 - iter 36/95 - loss 0.05604796 - time (sec): 6.95 - samples/sec: 1834.10 - lr: 0.000026 - momentum: 0.000000
+ 2024-03-26 10:13:54,366 epoch 6 - iter 45/95 - loss 0.05515355 - time (sec): 8.41 - samples/sec: 1870.09 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 10:13:55,813 epoch 6 - iter 54/95 - loss 0.05376542 - time (sec): 9.86 - samples/sec: 1868.77 - lr: 0.000025 - momentum: 0.000000
+ 2024-03-26 10:13:57,095 epoch 6 - iter 63/95 - loss 0.05273952 - time (sec): 11.14 - samples/sec: 1930.27 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 10:13:59,339 epoch 6 - iter 72/95 - loss 0.06092265 - time (sec): 13.38 - samples/sec: 1897.59 - lr: 0.000024 - momentum: 0.000000
+ 2024-03-26 10:14:00,927 epoch 6 - iter 81/95 - loss 0.05912352 - time (sec): 14.97 - samples/sec: 1914.17 - lr: 0.000023 - momentum: 0.000000
+ 2024-03-26 10:14:02,632 epoch 6 - iter 90/95 - loss 0.06035882 - time (sec): 16.68 - samples/sec: 1932.85 - lr: 0.000023 - momentum: 0.000000
+ 2024-03-26 10:14:03,897 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:14:03,897 EPOCH 6 done: loss 0.0603 - lr: 0.000023
+ 2024-03-26 10:14:04,793 DEV : loss 0.16718925535678864 - f1-score (micro avg) 0.9201
+ 2024-03-26 10:14:04,794 saving best model
+ 2024-03-26 10:14:05,238 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:14:07,130 epoch 7 - iter 9/95 - loss 0.05321789 - time (sec): 1.89 - samples/sec: 1679.94 - lr: 0.000022 - momentum: 0.000000
+ 2024-03-26 10:14:09,176 epoch 7 - iter 18/95 - loss 0.03684477 - time (sec): 3.94 - samples/sec: 1665.45 - lr: 0.000021 - momentum: 0.000000
+ 2024-03-26 10:14:10,709 epoch 7 - iter 27/95 - loss 0.03258736 - time (sec): 5.47 - samples/sec: 1788.71 - lr: 0.000021 - momentum: 0.000000
+ 2024-03-26 10:14:12,654 epoch 7 - iter 36/95 - loss 0.03305511 - time (sec): 7.41 - samples/sec: 1777.21 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 10:14:15,035 epoch 7 - iter 45/95 - loss 0.03992535 - time (sec): 9.80 - samples/sec: 1770.36 - lr: 0.000020 - momentum: 0.000000
+ 2024-03-26 10:14:16,548 epoch 7 - iter 54/95 - loss 0.03978942 - time (sec): 11.31 - samples/sec: 1778.83 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 10:14:18,735 epoch 7 - iter 63/95 - loss 0.04213624 - time (sec): 13.49 - samples/sec: 1784.59 - lr: 0.000019 - momentum: 0.000000
+ 2024-03-26 10:14:20,525 epoch 7 - iter 72/95 - loss 0.04587297 - time (sec): 15.29 - samples/sec: 1790.89 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 10:14:21,947 epoch 7 - iter 81/95 - loss 0.04361343 - time (sec): 16.71 - samples/sec: 1802.97 - lr: 0.000018 - momentum: 0.000000
+ 2024-03-26 10:14:23,915 epoch 7 - iter 90/95 - loss 0.04536795 - time (sec): 18.68 - samples/sec: 1783.65 - lr: 0.000017 - momentum: 0.000000
+ 2024-03-26 10:14:24,399 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:14:24,399 EPOCH 7 done: loss 0.0462 - lr: 0.000017
+ 2024-03-26 10:14:25,307 DEV : loss 0.16716967523097992 - f1-score (micro avg) 0.9411
+ 2024-03-26 10:14:25,308 saving best model
+ 2024-03-26 10:14:25,747 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:14:27,627 epoch 8 - iter 9/95 - loss 0.01951604 - time (sec): 1.88 - samples/sec: 1708.16 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 10:14:30,113 epoch 8 - iter 18/95 - loss 0.01840664 - time (sec): 4.36 - samples/sec: 1697.69 - lr: 0.000016 - momentum: 0.000000
+ 2024-03-26 10:14:31,879 epoch 8 - iter 27/95 - loss 0.02281365 - time (sec): 6.13 - samples/sec: 1736.13 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 10:14:33,422 epoch 8 - iter 36/95 - loss 0.02354290 - time (sec): 7.67 - samples/sec: 1728.02 - lr: 0.000015 - momentum: 0.000000
+ 2024-03-26 10:14:34,930 epoch 8 - iter 45/95 - loss 0.02180613 - time (sec): 9.18 - samples/sec: 1759.24 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 10:14:36,592 epoch 8 - iter 54/95 - loss 0.02226186 - time (sec): 10.84 - samples/sec: 1779.13 - lr: 0.000014 - momentum: 0.000000
+ 2024-03-26 10:14:38,783 epoch 8 - iter 63/95 - loss 0.03092783 - time (sec): 13.03 - samples/sec: 1775.62 - lr: 0.000013 - momentum: 0.000000
+ 2024-03-26 10:14:41,032 epoch 8 - iter 72/95 - loss 0.03257996 - time (sec): 15.28 - samples/sec: 1756.12 - lr: 0.000013 - momentum: 0.000000
+ 2024-03-26 10:14:42,690 epoch 8 - iter 81/95 - loss 0.03612619 - time (sec): 16.94 - samples/sec: 1758.38 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 10:14:43,981 epoch 8 - iter 90/95 - loss 0.03615101 - time (sec): 18.23 - samples/sec: 1800.86 - lr: 0.000012 - momentum: 0.000000
+ 2024-03-26 10:14:44,878 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:14:44,878 EPOCH 8 done: loss 0.0350 - lr: 0.000012
+ 2024-03-26 10:14:45,778 DEV : loss 0.1608782857656479 - f1-score (micro avg) 0.9517
+ 2024-03-26 10:14:45,779 saving best model
+ 2024-03-26 10:14:46,223 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:14:48,197 epoch 9 - iter 9/95 - loss 0.01453760 - time (sec): 1.97 - samples/sec: 1787.94 - lr: 0.000011 - momentum: 0.000000
+ 2024-03-26 10:14:49,915 epoch 9 - iter 18/95 - loss 0.02446034 - time (sec): 3.69 - samples/sec: 1813.90 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 10:14:51,796 epoch 9 - iter 27/95 - loss 0.02473284 - time (sec): 5.57 - samples/sec: 1834.03 - lr: 0.000010 - momentum: 0.000000
+ 2024-03-26 10:14:53,651 epoch 9 - iter 36/95 - loss 0.02318813 - time (sec): 7.43 - samples/sec: 1828.97 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 10:14:55,908 epoch 9 - iter 45/95 - loss 0.02045251 - time (sec): 9.68 - samples/sec: 1750.46 - lr: 0.000009 - momentum: 0.000000
+ 2024-03-26 10:14:57,835 epoch 9 - iter 54/95 - loss 0.02611207 - time (sec): 11.61 - samples/sec: 1738.74 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 10:14:59,723 epoch 9 - iter 63/95 - loss 0.02537917 - time (sec): 13.50 - samples/sec: 1749.31 - lr: 0.000008 - momentum: 0.000000
+ 2024-03-26 10:15:01,602 epoch 9 - iter 72/95 - loss 0.02461298 - time (sec): 15.38 - samples/sec: 1752.48 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 10:15:02,853 epoch 9 - iter 81/95 - loss 0.02473164 - time (sec): 16.63 - samples/sec: 1775.52 - lr: 0.000007 - momentum: 0.000000
+ 2024-03-26 10:15:04,257 epoch 9 - iter 90/95 - loss 0.02719568 - time (sec): 18.03 - samples/sec: 1797.34 - lr: 0.000006 - momentum: 0.000000
+ 2024-03-26 10:15:05,204 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:15:05,204 EPOCH 9 done: loss 0.0265 - lr: 0.000006
+ 2024-03-26 10:15:06,105 DEV : loss 0.18035191297531128 - f1-score (micro avg) 0.9468
+ 2024-03-26 10:15:06,106 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:15:08,246 epoch 10 - iter 9/95 - loss 0.00557531 - time (sec): 2.14 - samples/sec: 1781.65 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 10:15:09,506 epoch 10 - iter 18/95 - loss 0.00880795 - time (sec): 3.40 - samples/sec: 1904.54 - lr: 0.000005 - momentum: 0.000000
+ 2024-03-26 10:15:10,813 epoch 10 - iter 27/95 - loss 0.02526886 - time (sec): 4.71 - samples/sec: 2014.92 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 10:15:12,166 epoch 10 - iter 36/95 - loss 0.02295782 - time (sec): 6.06 - samples/sec: 2035.49 - lr: 0.000004 - momentum: 0.000000
+ 2024-03-26 10:15:14,123 epoch 10 - iter 45/95 - loss 0.01953834 - time (sec): 8.02 - samples/sec: 1986.50 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 10:15:15,695 epoch 10 - iter 54/95 - loss 0.01921737 - time (sec): 9.59 - samples/sec: 1977.11 - lr: 0.000003 - momentum: 0.000000
+ 2024-03-26 10:15:18,215 epoch 10 - iter 63/95 - loss 0.02060934 - time (sec): 12.11 - samples/sec: 1898.74 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 10:15:19,503 epoch 10 - iter 72/95 - loss 0.01949359 - time (sec): 13.40 - samples/sec: 1903.48 - lr: 0.000002 - momentum: 0.000000
+ 2024-03-26 10:15:21,840 epoch 10 - iter 81/95 - loss 0.01845566 - time (sec): 15.73 - samples/sec: 1851.36 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 10:15:24,060 epoch 10 - iter 90/95 - loss 0.02088092 - time (sec): 17.95 - samples/sec: 1831.79 - lr: 0.000001 - momentum: 0.000000
+ 2024-03-26 10:15:25,122 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:15:25,122 EPOCH 10 done: loss 0.0206 - lr: 0.000001
+ 2024-03-26 10:15:26,019 DEV : loss 0.18286916613578796 - f1-score (micro avg) 0.9417
+ 2024-03-26 10:15:26,304 ----------------------------------------------------------------------------------------------------
+ 2024-03-26 10:15:26,304 Loading model from best epoch ...
+ 2024-03-26 10:15:27,155 SequenceTagger predicts: Dictionary with 17 tags: O, S-Unternehmen, B-Unternehmen, E-Unternehmen, I-Unternehmen, S-Auslagerung, B-Auslagerung, E-Auslagerung, I-Auslagerung, S-Ort, B-Ort, E-Ort, I-Ort, S-Software, B-Software, E-Software, I-Software
+ 2024-03-26 10:15:27,901
+ Results:
+ - F-score (micro) 0.9163
+ - F-score (macro) 0.6959
+ - Accuracy 0.8504
+
+ By class:
+ precision recall f1-score support
+
+ Unternehmen 0.9173 0.8759 0.8962 266
+ Auslagerung 0.8851 0.9277 0.9059 249
+ Ort 0.9708 0.9925 0.9815 134
+ Software 0.0000 0.0000 0.0000 0
+
+ micro avg 0.9128 0.9199 0.9163 649
+ macro avg 0.6933 0.6990 0.6959 649
+ weighted avg 0.9160 0.9199 0.9175 649
+
+ 2024-03-26 10:15:27,901 ----------------------------------------------------------------------------------------------------
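The best checkpoint reported above is written to best-model.pt under the training base path and can be loaded back with Flair for inference. A minimal sketch, assuming the tagger was trained with label type "ner"; the example sentence is invented for illustration.

    # Sketch only: load the best checkpoint from the base path above and tag a sentence.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load("flair-co-funer-gbert_base-bs8-e10-lr5e-05-3/best-model.pt")

    sentence = Sentence("Die Verwahrung der Wertpapiere wurde an die Beispielbank AG in Frankfurt ausgelagert.")
    tagger.predict(sentence)

    for span in sentence.get_spans("ner"):  # label type "ner" assumed
        print(span.text, span.get_label("ner").value)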