Commit be4d905 by stefan-it (1 parent: 5a6f3a9)

Upload ./training.log with huggingface_hub

Files changed (1): training.log (+239 lines)
2023-10-25 10:37:21,901 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,902 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 10:37:21,902 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Train: 6183 sentences
2023-10-25 10:37:21,903 (train_with_dev=False, train_with_test=False)
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Training Params:
2023-10-25 10:37:21,903  - learning_rate: "3e-05"
2023-10-25 10:37:21,903  - mini_batch_size: "8"
2023-10-25 10:37:21,903  - max_epochs: "10"
2023-10-25 10:37:21,903  - shuffle: "True"
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Plugins:
2023-10-25 10:37:21,903  - TensorboardLogger
2023-10-25 10:37:21,903  - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 10:37:21,903  - metric: "('micro avg', 'f1-score')"
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Computation:
2023-10-25 10:37:21,903  - compute on device: cuda:0
2023-10-25 10:37:21,903  - embedding storage: none
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Model training base path: "hmbench-topres19th/en-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 ----------------------------------------------------------------------------------------------------
2023-10-25 10:37:21,903 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 10:37:26,605 epoch 1 - iter 77/773 - loss 2.00342293 - time (sec): 4.70 - samples/sec: 2712.01 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:37:31,262 epoch 1 - iter 154/773 - loss 1.14055389 - time (sec): 9.36 - samples/sec: 2662.74 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:37:35,889 epoch 1 - iter 231/773 - loss 0.82625202 - time (sec): 13.98 - samples/sec: 2642.41 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:37:40,575 epoch 1 - iter 308/773 - loss 0.64493288 - time (sec): 18.67 - samples/sec: 2668.34 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:37:45,208 epoch 1 - iter 385/773 - loss 0.53761507 - time (sec): 23.30 - samples/sec: 2669.10 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:37:49,742 epoch 1 - iter 462/773 - loss 0.47184545 - time (sec): 27.84 - samples/sec: 2656.78 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:37:54,392 epoch 1 - iter 539/773 - loss 0.41778036 - time (sec): 32.49 - samples/sec: 2672.72 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:37:59,140 epoch 1 - iter 616/773 - loss 0.37526149 - time (sec): 37.24 - samples/sec: 2675.01 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:38:03,677 epoch 1 - iter 693/773 - loss 0.34438623 - time (sec): 41.77 - samples/sec: 2678.19 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:38:08,219 epoch 1 - iter 770/773 - loss 0.32016433 - time (sec): 46.31 - samples/sec: 2673.41 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:38:08,404 ----------------------------------------------------------------------------------------------------
2023-10-25 10:38:08,404 EPOCH 1 done: loss 0.3193 - lr: 0.000030
2023-10-25 10:38:11,901 DEV : loss 0.05575157329440117 - f1-score (micro avg)  0.7258
2023-10-25 10:38:11,920 saving best model
2023-10-25 10:38:12,442 ----------------------------------------------------------------------------------------------------
2023-10-25 10:38:17,190 epoch 2 - iter 77/773 - loss 0.06917782 - time (sec): 4.75 - samples/sec: 2589.73 - lr: 0.000030 - momentum: 0.000000
2023-10-25 10:38:21,845 epoch 2 - iter 154/773 - loss 0.07295557 - time (sec): 9.40 - samples/sec: 2635.98 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:38:26,432 epoch 2 - iter 231/773 - loss 0.07445178 - time (sec): 13.99 - samples/sec: 2577.67 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:38:31,123 epoch 2 - iter 308/773 - loss 0.07546608 - time (sec): 18.68 - samples/sec: 2620.82 - lr: 0.000029 - momentum: 0.000000
2023-10-25 10:38:35,870 epoch 2 - iter 385/773 - loss 0.07298737 - time (sec): 23.43 - samples/sec: 2630.85 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:38:40,555 epoch 2 - iter 462/773 - loss 0.07131937 - time (sec): 28.11 - samples/sec: 2642.70 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:38:45,260 epoch 2 - iter 539/773 - loss 0.07085615 - time (sec): 32.82 - samples/sec: 2652.90 - lr: 0.000028 - momentum: 0.000000
2023-10-25 10:38:50,000 epoch 2 - iter 616/773 - loss 0.07015877 - time (sec): 37.56 - samples/sec: 2628.79 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:38:54,537 epoch 2 - iter 693/773 - loss 0.06882789 - time (sec): 42.09 - samples/sec: 2624.41 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:38:59,395 epoch 2 - iter 770/773 - loss 0.06928794 - time (sec): 46.95 - samples/sec: 2635.97 - lr: 0.000027 - momentum: 0.000000
2023-10-25 10:38:59,585 ----------------------------------------------------------------------------------------------------
2023-10-25 10:38:59,586 EPOCH 2 done: loss 0.0691 - lr: 0.000027
2023-10-25 10:39:02,404 DEV : loss 0.049297548830509186 - f1-score (micro avg)  0.8142
2023-10-25 10:39:02,422 saving best model
2023-10-25 10:39:03,089 ----------------------------------------------------------------------------------------------------
2023-10-25 10:39:07,755 epoch 3 - iter 77/773 - loss 0.04741193 - time (sec): 4.66 - samples/sec: 2618.48 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:39:12,662 epoch 3 - iter 154/773 - loss 0.04548219 - time (sec): 9.57 - samples/sec: 2493.23 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:39:17,185 epoch 3 - iter 231/773 - loss 0.04459085 - time (sec): 14.09 - samples/sec: 2538.42 - lr: 0.000026 - momentum: 0.000000
2023-10-25 10:39:21,796 epoch 3 - iter 308/773 - loss 0.04386816 - time (sec): 18.70 - samples/sec: 2585.25 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:39:26,414 epoch 3 - iter 385/773 - loss 0.04428707 - time (sec): 23.32 - samples/sec: 2595.54 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:39:31,080 epoch 3 - iter 462/773 - loss 0.04386941 - time (sec): 27.99 - samples/sec: 2628.33 - lr: 0.000025 - momentum: 0.000000
2023-10-25 10:39:35,843 epoch 3 - iter 539/773 - loss 0.04425161 - time (sec): 32.75 - samples/sec: 2629.03 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:39:40,653 epoch 3 - iter 616/773 - loss 0.04565354 - time (sec): 37.56 - samples/sec: 2628.00 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:39:45,336 epoch 3 - iter 693/773 - loss 0.04638286 - time (sec): 42.24 - samples/sec: 2640.83 - lr: 0.000024 - momentum: 0.000000
2023-10-25 10:39:50,069 epoch 3 - iter 770/773 - loss 0.04580281 - time (sec): 46.98 - samples/sec: 2637.11 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:39:50,249 ----------------------------------------------------------------------------------------------------
2023-10-25 10:39:50,249 EPOCH 3 done: loss 0.0457 - lr: 0.000023
2023-10-25 10:39:53,011 DEV : loss 0.07478724420070648 - f1-score (micro avg)  0.7705
2023-10-25 10:39:53,029 ----------------------------------------------------------------------------------------------------
2023-10-25 10:39:57,751 epoch 4 - iter 77/773 - loss 0.02381115 - time (sec): 4.72 - samples/sec: 2644.65 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:40:02,403 epoch 4 - iter 154/773 - loss 0.02272232 - time (sec): 9.37 - samples/sec: 2696.05 - lr: 0.000023 - momentum: 0.000000
2023-10-25 10:40:07,048 epoch 4 - iter 231/773 - loss 0.02306162 - time (sec): 14.02 - samples/sec: 2694.29 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:40:11,691 epoch 4 - iter 308/773 - loss 0.02500263 - time (sec): 18.66 - samples/sec: 2695.98 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:40:16,351 epoch 4 - iter 385/773 - loss 0.02577652 - time (sec): 23.32 - samples/sec: 2669.27 - lr: 0.000022 - momentum: 0.000000
2023-10-25 10:40:21,109 epoch 4 - iter 462/773 - loss 0.02834569 - time (sec): 28.08 - samples/sec: 2640.19 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:40:26,055 epoch 4 - iter 539/773 - loss 0.02860700 - time (sec): 33.02 - samples/sec: 2629.75 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:40:30,766 epoch 4 - iter 616/773 - loss 0.02820990 - time (sec): 37.73 - samples/sec: 2639.20 - lr: 0.000021 - momentum: 0.000000
2023-10-25 10:40:35,442 epoch 4 - iter 693/773 - loss 0.02789789 - time (sec): 42.41 - samples/sec: 2648.82 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:40:39,951 epoch 4 - iter 770/773 - loss 0.02986146 - time (sec): 46.92 - samples/sec: 2638.85 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:40:40,133 ----------------------------------------------------------------------------------------------------
2023-10-25 10:40:40,133 EPOCH 4 done: loss 0.0299 - lr: 0.000020
2023-10-25 10:40:42,825 DEV : loss 0.08224356174468994 - f1-score (micro avg)  0.7658
2023-10-25 10:40:42,842 ----------------------------------------------------------------------------------------------------
2023-10-25 10:40:47,542 epoch 5 - iter 77/773 - loss 0.02345774 - time (sec): 4.70 - samples/sec: 2619.37 - lr: 0.000020 - momentum: 0.000000
2023-10-25 10:40:52,241 epoch 5 - iter 154/773 - loss 0.02122391 - time (sec): 9.40 - samples/sec: 2625.76 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:40:56,970 epoch 5 - iter 231/773 - loss 0.01984080 - time (sec): 14.13 - samples/sec: 2661.60 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:41:01,364 epoch 5 - iter 308/773 - loss 0.02263347 - time (sec): 18.52 - samples/sec: 2686.02 - lr: 0.000019 - momentum: 0.000000
2023-10-25 10:41:06,019 epoch 5 - iter 385/773 - loss 0.02194096 - time (sec): 23.18 - samples/sec: 2706.00 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:41:10,705 epoch 5 - iter 462/773 - loss 0.02179624 - time (sec): 27.86 - samples/sec: 2708.53 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:41:15,249 epoch 5 - iter 539/773 - loss 0.02057243 - time (sec): 32.41 - samples/sec: 2722.04 - lr: 0.000018 - momentum: 0.000000
2023-10-25 10:41:19,743 epoch 5 - iter 616/773 - loss 0.02059981 - time (sec): 36.90 - samples/sec: 2702.67 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:41:24,238 epoch 5 - iter 693/773 - loss 0.02048126 - time (sec): 41.39 - samples/sec: 2712.66 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:41:28,644 epoch 5 - iter 770/773 - loss 0.02099495 - time (sec): 45.80 - samples/sec: 2703.67 - lr: 0.000017 - momentum: 0.000000
2023-10-25 10:41:28,839 ----------------------------------------------------------------------------------------------------
2023-10-25 10:41:28,840 EPOCH 5 done: loss 0.0212 - lr: 0.000017
2023-10-25 10:41:31,552 DEV : loss 0.09945573657751083 - f1-score (micro avg)  0.781
2023-10-25 10:41:31,572 ----------------------------------------------------------------------------------------------------
2023-10-25 10:41:36,179 epoch 6 - iter 77/773 - loss 0.01534591 - time (sec): 4.60 - samples/sec: 2745.09 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:41:40,809 epoch 6 - iter 154/773 - loss 0.01519805 - time (sec): 9.24 - samples/sec: 2722.23 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:41:45,397 epoch 6 - iter 231/773 - loss 0.01464584 - time (sec): 13.82 - samples/sec: 2671.47 - lr: 0.000016 - momentum: 0.000000
2023-10-25 10:41:50,157 epoch 6 - iter 308/773 - loss 0.01414850 - time (sec): 18.58 - samples/sec: 2679.03 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:41:54,914 epoch 6 - iter 385/773 - loss 0.01433663 - time (sec): 23.34 - samples/sec: 2701.85 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:41:59,690 epoch 6 - iter 462/773 - loss 0.01277122 - time (sec): 28.12 - samples/sec: 2696.50 - lr: 0.000015 - momentum: 0.000000
2023-10-25 10:42:04,329 epoch 6 - iter 539/773 - loss 0.01360655 - time (sec): 32.76 - samples/sec: 2682.46 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:42:09,181 epoch 6 - iter 616/773 - loss 0.01363450 - time (sec): 37.61 - samples/sec: 2648.33 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:42:14,051 epoch 6 - iter 693/773 - loss 0.01353720 - time (sec): 42.48 - samples/sec: 2630.62 - lr: 0.000014 - momentum: 0.000000
2023-10-25 10:42:18,779 epoch 6 - iter 770/773 - loss 0.01363499 - time (sec): 47.21 - samples/sec: 2624.71 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:42:18,961 ----------------------------------------------------------------------------------------------------
2023-10-25 10:42:18,962 EPOCH 6 done: loss 0.0140 - lr: 0.000013
2023-10-25 10:42:22,522 DEV : loss 0.11278796941041946 - f1-score (micro avg)  0.7753
2023-10-25 10:42:22,540 ----------------------------------------------------------------------------------------------------
2023-10-25 10:42:27,293 epoch 7 - iter 77/773 - loss 0.00960456 - time (sec): 4.75 - samples/sec: 2680.96 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:42:32,004 epoch 7 - iter 154/773 - loss 0.00936374 - time (sec): 9.46 - samples/sec: 2626.52 - lr: 0.000013 - momentum: 0.000000
2023-10-25 10:42:36,826 epoch 7 - iter 231/773 - loss 0.00747500 - time (sec): 14.28 - samples/sec: 2713.22 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:42:41,284 epoch 7 - iter 308/773 - loss 0.00789143 - time (sec): 18.74 - samples/sec: 2647.02 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:42:45,927 epoch 7 - iter 385/773 - loss 0.00801181 - time (sec): 23.39 - samples/sec: 2644.58 - lr: 0.000012 - momentum: 0.000000
2023-10-25 10:42:50,560 epoch 7 - iter 462/773 - loss 0.00730589 - time (sec): 28.02 - samples/sec: 2658.91 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:42:55,182 epoch 7 - iter 539/773 - loss 0.00808199 - time (sec): 32.64 - samples/sec: 2631.74 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:42:59,804 epoch 7 - iter 616/773 - loss 0.00863132 - time (sec): 37.26 - samples/sec: 2626.61 - lr: 0.000011 - momentum: 0.000000
2023-10-25 10:43:04,706 epoch 7 - iter 693/773 - loss 0.00876967 - time (sec): 42.16 - samples/sec: 2631.43 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:43:09,406 epoch 7 - iter 770/773 - loss 0.00915212 - time (sec): 46.86 - samples/sec: 2639.82 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:43:09,584 ----------------------------------------------------------------------------------------------------
2023-10-25 10:43:09,584 EPOCH 7 done: loss 0.0091 - lr: 0.000010
2023-10-25 10:43:12,629 DEV : loss 0.11861388385295868 - f1-score (micro avg)  0.7724
2023-10-25 10:43:12,647 ----------------------------------------------------------------------------------------------------
2023-10-25 10:43:17,325 epoch 8 - iter 77/773 - loss 0.00461380 - time (sec): 4.68 - samples/sec: 2636.24 - lr: 0.000010 - momentum: 0.000000
2023-10-25 10:43:21,940 epoch 8 - iter 154/773 - loss 0.00702621 - time (sec): 9.29 - samples/sec: 2672.72 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:43:26,655 epoch 8 - iter 231/773 - loss 0.00812508 - time (sec): 14.01 - samples/sec: 2598.23 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:43:31,362 epoch 8 - iter 308/773 - loss 0.00661780 - time (sec): 18.71 - samples/sec: 2595.72 - lr: 0.000009 - momentum: 0.000000
2023-10-25 10:43:35,901 epoch 8 - iter 385/773 - loss 0.00640004 - time (sec): 23.25 - samples/sec: 2629.17 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:43:40,469 epoch 8 - iter 462/773 - loss 0.00633717 - time (sec): 27.82 - samples/sec: 2675.90 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:43:45,129 epoch 8 - iter 539/773 - loss 0.00653842 - time (sec): 32.48 - samples/sec: 2677.43 - lr: 0.000008 - momentum: 0.000000
2023-10-25 10:43:49,745 epoch 8 - iter 616/773 - loss 0.00732939 - time (sec): 37.10 - samples/sec: 2673.52 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:43:54,376 epoch 8 - iter 693/773 - loss 0.00709463 - time (sec): 41.73 - samples/sec: 2665.60 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:43:59,079 epoch 8 - iter 770/773 - loss 0.00668988 - time (sec): 46.43 - samples/sec: 2665.10 - lr: 0.000007 - momentum: 0.000000
2023-10-25 10:43:59,266 ----------------------------------------------------------------------------------------------------
2023-10-25 10:43:59,266 EPOCH 8 done: loss 0.0067 - lr: 0.000007
2023-10-25 10:44:02,424 DEV : loss 0.10935225337743759 - f1-score (micro avg)  0.7901
2023-10-25 10:44:02,442 ----------------------------------------------------------------------------------------------------
2023-10-25 10:44:07,211 epoch 9 - iter 77/773 - loss 0.00280222 - time (sec): 4.77 - samples/sec: 2641.93 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:44:11,756 epoch 9 - iter 154/773 - loss 0.00290731 - time (sec): 9.31 - samples/sec: 2687.34 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:44:16,394 epoch 9 - iter 231/773 - loss 0.00405151 - time (sec): 13.95 - samples/sec: 2681.13 - lr: 0.000006 - momentum: 0.000000
2023-10-25 10:44:21,122 epoch 9 - iter 308/773 - loss 0.00435854 - time (sec): 18.68 - samples/sec: 2697.32 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:44:25,805 epoch 9 - iter 385/773 - loss 0.00434929 - time (sec): 23.36 - samples/sec: 2682.52 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:44:30,607 epoch 9 - iter 462/773 - loss 0.00405141 - time (sec): 28.16 - samples/sec: 2661.17 - lr: 0.000005 - momentum: 0.000000
2023-10-25 10:44:35,355 epoch 9 - iter 539/773 - loss 0.00398165 - time (sec): 32.91 - samples/sec: 2641.67 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:44:40,206 epoch 9 - iter 616/773 - loss 0.00428787 - time (sec): 37.76 - samples/sec: 2622.47 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:44:44,909 epoch 9 - iter 693/773 - loss 0.00416455 - time (sec): 42.47 - samples/sec: 2642.13 - lr: 0.000004 - momentum: 0.000000
2023-10-25 10:44:49,527 epoch 9 - iter 770/773 - loss 0.00393684 - time (sec): 47.08 - samples/sec: 2633.23 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:44:49,707 ----------------------------------------------------------------------------------------------------
2023-10-25 10:44:49,707 EPOCH 9 done: loss 0.0039 - lr: 0.000003
2023-10-25 10:44:52,325 DEV : loss 0.11163745075464249 - f1-score (micro avg)  0.7942
2023-10-25 10:44:52,342 ----------------------------------------------------------------------------------------------------
2023-10-25 10:44:57,023 epoch 10 - iter 77/773 - loss 0.00160620 - time (sec): 4.68 - samples/sec: 2530.57 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:45:01,740 epoch 10 - iter 154/773 - loss 0.00218583 - time (sec): 9.40 - samples/sec: 2501.82 - lr: 0.000003 - momentum: 0.000000
2023-10-25 10:45:06,385 epoch 10 - iter 231/773 - loss 0.00253136 - time (sec): 14.04 - samples/sec: 2524.83 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:45:10,823 epoch 10 - iter 308/773 - loss 0.00321432 - time (sec): 18.48 - samples/sec: 2583.04 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:45:15,327 epoch 10 - iter 385/773 - loss 0.00292953 - time (sec): 22.98 - samples/sec: 2578.23 - lr: 0.000002 - momentum: 0.000000
2023-10-25 10:45:20,048 epoch 10 - iter 462/773 - loss 0.00283533 - time (sec): 27.70 - samples/sec: 2604.20 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:45:24,719 epoch 10 - iter 539/773 - loss 0.00275001 - time (sec): 32.37 - samples/sec: 2628.38 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:45:29,618 epoch 10 - iter 616/773 - loss 0.00267312 - time (sec): 37.27 - samples/sec: 2636.23 - lr: 0.000001 - momentum: 0.000000
2023-10-25 10:45:34,364 epoch 10 - iter 693/773 - loss 0.00242572 - time (sec): 42.02 - samples/sec: 2648.91 - lr: 0.000000 - momentum: 0.000000
2023-10-25 10:45:39,113 epoch 10 - iter 770/773 - loss 0.00279658 - time (sec): 46.77 - samples/sec: 2642.08 - lr: 0.000000 - momentum: 0.000000
2023-10-25 10:45:39,310 ----------------------------------------------------------------------------------------------------
2023-10-25 10:45:39,310 EPOCH 10 done: loss 0.0029 - lr: 0.000000
2023-10-25 10:45:42,349 DEV : loss 0.11523404717445374 - f1-score (micro avg)  0.7884
2023-10-25 10:45:43,297 ----------------------------------------------------------------------------------------------------
2023-10-25 10:45:43,299 Loading model from best epoch ...
2023-10-25 10:45:45,437 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-25 10:45:55,712
Results:
- F-score (micro) 0.7656
- F-score (macro) 0.6513
- Accuracy 0.641

By class:
              precision    recall  f1-score   support

         LOC     0.8262    0.8140    0.8200       946
    BUILDING     0.5258    0.5514    0.5383       185
      STREET     0.7368    0.5000    0.5957        56

   micro avg     0.7732    0.7582    0.7656      1187
   macro avg     0.6963    0.6218    0.6513      1187
weighted avg     0.7751    0.7582    0.7655      1187

2023-10-25 10:45:55,712 ----------------------------------------------------------------------------------------------------