stefan-it committed · Commit 39a1928 · 1 Parent(s): 15c7f20

Upload folder using huggingface_hub

best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:89c68ee43b8af8aa6a3cf577a9952d47d4f707d3f4eabfe12a222b893cde1df0
+ size 19045986
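The three added lines are a Git LFS pointer, not the model weights themselves: the commit stores only the spec version, a SHA-256 object id, and the byte size of the real file. A minimal sketch of reading such a pointer, assuming the one `key value` pair per line layout shown above; `parse_lfs_pointer` is a hypothetical helper written for this note, not a function of git-lfs or huggingface_hub.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into its version, hash algorithm, digest and size."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the best-model.pt diff above.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:89c68ee43b8af8aa6a3cf577a9952d47d4f707d3f4eabfe12a222b893cde1df0
size 19045986
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size"])
```

The digest and size can then be compared against a downloaded copy of the file, for example with `hashlib.sha256`.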
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+ 1     09:56:41  0.0000        0.9134     0.1172   0.5000        0.0127     0.0247 0.0125
+ 2     09:57:06  0.0000        0.1797     0.0957   0.6457        0.3460     0.4505 0.2982
+ 3     09:57:31  0.0000        0.1500     0.0935   0.5990        0.4979     0.5438 0.3856
+ 4     09:57:56  0.0000        0.1335     0.0891   0.5781        0.5781     0.5781 0.4228
+ 5     09:58:21  0.0000        0.1245     0.0944   0.6394        0.5612     0.5978 0.4448
+ 6     09:58:46  0.0000        0.1158     0.0926   0.6636        0.6076     0.6344 0.4848
+ 7     09:59:11  0.0000        0.1098     0.0989   0.6134        0.6160     0.6147 0.4650
+ 8     09:59:36  0.0000        0.1048     0.1057   0.6234        0.6287     0.6261 0.4745
+ 9     10:00:01  0.0000        0.1004     0.1049   0.6771        0.6371     0.6565 0.5050
+ 10    10:00:25  0.0000        0.0962     0.1053   0.6524        0.6414     0.6468 0.4935
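The table above can be used to check which epoch produced the saved checkpoint. The sketch below parses the rows as whitespace-separated text (an assumption; the real file is tab-separated, which `str.split()` also handles) and picks the epoch with the highest DEV_F1, alongside the epoch with the lowest DEV_LOSS for comparison. `parse_loss_table` is a hypothetical helper, not part of Flair.

```python
LOSS_TSV = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 09:56:41 0.0000 0.9134 0.1172 0.5000 0.0127 0.0247 0.0125
2 09:57:06 0.0000 0.1797 0.0957 0.6457 0.3460 0.4505 0.2982
3 09:57:31 0.0000 0.1500 0.0935 0.5990 0.4979 0.5438 0.3856
4 09:57:56 0.0000 0.1335 0.0891 0.5781 0.5781 0.5781 0.4228
5 09:58:21 0.0000 0.1245 0.0944 0.6394 0.5612 0.5978 0.4448
6 09:58:46 0.0000 0.1158 0.0926 0.6636 0.6076 0.6344 0.4848
7 09:59:11 0.0000 0.1098 0.0989 0.6134 0.6160 0.6147 0.4650
8 09:59:36 0.0000 0.1048 0.1057 0.6234 0.6287 0.6261 0.4745
9 10:00:01 0.0000 0.1004 0.1049 0.6771 0.6371 0.6565 0.5050
10 10:00:25 0.0000 0.0962 0.1053 0.6524 0.6414 0.6468 0.4935
"""


def parse_loss_table(text: str) -> list[dict]:
    """Turn the whitespace-separated loss table into a list of typed records."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    table = []
    for row in rows:
        record = dict(zip(header, row))
        record["EPOCH"] = int(record["EPOCH"])
        # Every column after TIMESTAMP is numeric.
        for key in header[2:]:
            record[key] = float(record[key])
        table.append(record)
    return table


epochs = parse_loss_table(LOSS_TSV)
best_f1 = max(epochs, key=lambda r: r["DEV_F1"])
lowest_loss = min(epochs, key=lambda r: r["DEV_LOSS"])

print(best_f1["EPOCH"], best_f1["DEV_F1"])            # 9 0.6565
print(lowest_loss["EPOCH"], lowest_loss["DEV_LOSS"])  # 4 0.0891
```

The two criteria disagree here: dev loss bottoms out at epoch 4 while dev F1 peaks at epoch 9. The training log lists micro-average F1 as the selection metric, and its last "saving best model" message comes after epoch 9.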
runs/events.out.tfevents.1697795777.46dc0c540dd0.5704.13 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9f249ce88c4653c4386ebf888bdd00354cc746bbfa5bdf74aff050f8729f6653
+ size 864636
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,244 @@
+ 2023-10-20 09:56:17,330 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:56:17,330 Model: "SequenceTagger(
+   (embeddings): TransformerWordEmbeddings(
+     (model): BertModel(
+       (embeddings): BertEmbeddings(
+         (word_embeddings): Embedding(32001, 128)
+         (position_embeddings): Embedding(512, 128)
+         (token_type_embeddings): Embedding(2, 128)
+         (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+       )
+       (encoder): BertEncoder(
+         (layer): ModuleList(
+           (0-1): 2 x BertLayer(
+             (attention): BertAttention(
+               (self): BertSelfAttention(
+                 (query): Linear(in_features=128, out_features=128, bias=True)
+                 (key): Linear(in_features=128, out_features=128, bias=True)
+                 (value): Linear(in_features=128, out_features=128, bias=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+               (output): BertSelfOutput(
+                 (dense): Linear(in_features=128, out_features=128, bias=True)
+                 (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+                 (dropout): Dropout(p=0.1, inplace=False)
+               )
+             )
+             (intermediate): BertIntermediate(
+               (dense): Linear(in_features=128, out_features=512, bias=True)
+               (intermediate_act_fn): GELUActivation()
+             )
+             (output): BertOutput(
+               (dense): Linear(in_features=512, out_features=128, bias=True)
+               (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+         )
+       )
+       (pooler): BertPooler(
+         (dense): Linear(in_features=128, out_features=128, bias=True)
+         (activation): Tanh()
+       )
+     )
+   )
+   (locked_dropout): LockedDropout(p=0.5)
+   (linear): Linear(in_features=128, out_features=13, bias=True)
+   (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-20 09:56:17,330 ----------------------------------------------------------------------------------------------------
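The module printout above is enough to estimate the parameter count by hand. The sketch below adds up weights and biases for each printed layer, assuming every `Linear` has a bias (as printed) and every `LayerNorm` has a weight and bias vector (`elementwise_affine=True`). The helper names are illustrative, and the 4 bytes per parameter figure assumes float32 storage, which the log does not state.

```python
def linear(n_in: int, n_out: int) -> int:
    """Parameters of a Linear layer with bias."""
    return n_in * n_out + n_out


def layer_norm(dim: int) -> int:
    """Parameters of an affine LayerNorm (weight + bias)."""
    return 2 * dim


# Sizes read off the printed architecture.
HIDDEN, FFN = 128, 512
VOCAB, POSITIONS, TOKEN_TYPES = 32001, 512, 2
LAYERS, TAGS = 2, 13

embeddings = (VOCAB + POSITIONS + TOKEN_TYPES) * HIDDEN + layer_norm(HIDDEN)
bert_layer = (
    4 * linear(HIDDEN, HIDDEN)  # query, key, value, attention output dense
    + layer_norm(HIDDEN)        # attention output LayerNorm
    + linear(HIDDEN, FFN)       # intermediate dense
    + linear(FFN, HIDDEN)       # output dense
    + layer_norm(HIDDEN)        # output LayerNorm
)
pooler = linear(HIDDEN, HIDDEN)
tagger_head = linear(HIDDEN, TAGS)

total = embeddings + LAYERS * bert_layer + pooler + tagger_head
print(total)      # 4576909
print(total * 4)  # 18307636 bytes if stored as float32 (assumption)
```

About 4.58 million parameters, of which roughly 4.1 million sit in the word embedding matrix. At float32 that is about 18.3 MB, the same order as the 19,045,986-byte best-model.pt pointer in this commit.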
+ 2023-10-20 09:56:17,330 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
+ - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
+ 2023-10-20 09:56:17,330 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:56:17,330 Train: 6183 sentences
+ 2023-10-20 09:56:17,330 (train_with_dev=False, train_with_test=False)
+ 2023-10-20 09:56:17,330 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:56:17,330 Training Params:
+ 2023-10-20 09:56:17,330 - learning_rate: "5e-05"
+ 2023-10-20 09:56:17,330 - mini_batch_size: "4"
+ 2023-10-20 09:56:17,330 - max_epochs: "10"
+ 2023-10-20 09:56:17,330 - shuffle: "True"
+ 2023-10-20 09:56:17,330 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:56:17,330 Plugins:
+ 2023-10-20 09:56:17,331 - TensorboardLogger
+ 2023-10-20 09:56:17,331 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-20 09:56:17,331 ----------------------------------------------------------------------------------------------------
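The parameters above (learning_rate 5e-05, 10 epochs, warmup_fraction 0.1) together with the 1546 iterations per epoch reported in the log imply a schedule over 15460 steps. The sketch below assumes the common "linear warmup, then linear decay to zero" shape; that is an assumption about what LinearScheduler does, not a copy of Flair's code, and `linear_schedule_lr` is a hypothetical name. Under that assumption the values match the `lr:` column of the log to the six decimals it prints.

```python
def linear_schedule_lr(
    step: int,
    total_steps: int,
    peak_lr: float = 5e-5,
    warmup_fraction: float = 0.1,
) -> float:
    """Learning rate after `step` optimizer steps (assumed schedule shape)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Linear decay from the peak down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


ITERS_PER_EPOCH = 1546  # from "iter 154/1546" in the log
MAX_EPOCHS = 10
TOTAL_STEPS = ITERS_PER_EPOCH * MAX_EPOCHS

# (epoch, iteration) pairs taken from the log, converted to a global step.
for epoch, iteration in [(1, 154), (1, 1540), (2, 154), (5, 1078), (10, 1540)]:
    step = (epoch - 1) * ITERS_PER_EPOCH + iteration
    print(epoch, iteration, f"{linear_schedule_lr(step, TOTAL_STEPS):.6f}")
```

This also explains the LEARNING_RATE column of loss.tsv: values of 5e-05 and below all print as 0.0000 at four decimals.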
+ 2023-10-20 09:56:17,331 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-20 09:56:17,331 - metric: "('micro avg', 'f1-score')"
+ 2023-10-20 09:56:17,331 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:56:17,331 Computation:
+ 2023-10-20 09:56:17,331 - compute on device: cuda:0
+ 2023-10-20 09:56:17,331 - embedding storage: none
+ 2023-10-20 09:56:17,331 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:56:17,331 Model training base path: "hmbench-topres19th/en-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-4"
+ 2023-10-20 09:56:17,331 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:56:17,331 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:56:17,331 Logging anything other than scalars to TensorBoard is currently not supported.
+ 2023-10-20 09:56:19,693 epoch 1 - iter 154/1546 - loss 3.62102707 - time (sec): 2.36 - samples/sec: 5236.15 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-20 09:56:22,114 epoch 1 - iter 308/1546 - loss 3.13477517 - time (sec): 4.78 - samples/sec: 5182.14 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-20 09:56:24,433 epoch 1 - iter 462/1546 - loss 2.46338633 - time (sec): 7.10 - samples/sec: 5146.78 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-20 09:56:26,539 epoch 1 - iter 616/1546 - loss 1.93374896 - time (sec): 9.21 - samples/sec: 5337.75 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-20 09:56:28,673 epoch 1 - iter 770/1546 - loss 1.61995330 - time (sec): 11.34 - samples/sec: 5337.89 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-20 09:56:30,909 epoch 1 - iter 924/1546 - loss 1.39359053 - time (sec): 13.58 - samples/sec: 5362.23 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-20 09:56:33,199 epoch 1 - iter 1078/1546 - loss 1.23035252 - time (sec): 15.87 - samples/sec: 5350.81 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-20 09:56:35,625 epoch 1 - iter 1232/1546 - loss 1.09712317 - time (sec): 18.29 - samples/sec: 5368.14 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-20 09:56:37,924 epoch 1 - iter 1386/1546 - loss 0.99421232 - time (sec): 20.59 - samples/sec: 5403.80 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-20 09:56:40,172 epoch 1 - iter 1540/1546 - loss 0.91571471 - time (sec): 22.84 - samples/sec: 5422.50 - lr: 0.000050 - momentum: 0.000000
+ 2023-10-20 09:56:40,282 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:56:40,282 EPOCH 1 done: loss 0.9134 - lr: 0.000050
+ 2023-10-20 09:56:41,270 DEV : loss 0.11716283857822418 - f1-score (micro avg) 0.0247
+ 2023-10-20 09:56:41,282 saving best model
+ 2023-10-20 09:56:41,311 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:56:43,719 epoch 2 - iter 154/1546 - loss 0.19306759 - time (sec): 2.41 - samples/sec: 5487.96 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-20 09:56:46,138 epoch 2 - iter 308/1546 - loss 0.19527201 - time (sec): 4.83 - samples/sec: 5486.99 - lr: 0.000049 - momentum: 0.000000
+ 2023-10-20 09:56:48,446 epoch 2 - iter 462/1546 - loss 0.19787880 - time (sec): 7.13 - samples/sec: 5214.32 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-20 09:56:50,833 epoch 2 - iter 616/1546 - loss 0.19678261 - time (sec): 9.52 - samples/sec: 5154.17 - lr: 0.000048 - momentum: 0.000000
+ 2023-10-20 09:56:53,210 epoch 2 - iter 770/1546 - loss 0.19605129 - time (sec): 11.90 - samples/sec: 5121.25 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-20 09:56:55,564 epoch 2 - iter 924/1546 - loss 0.19154674 - time (sec): 14.25 - samples/sec: 5156.29 - lr: 0.000047 - momentum: 0.000000
+ 2023-10-20 09:56:57,968 epoch 2 - iter 1078/1546 - loss 0.18535945 - time (sec): 16.66 - samples/sec: 5202.28 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-20 09:57:00,295 epoch 2 - iter 1232/1546 - loss 0.18750537 - time (sec): 18.98 - samples/sec: 5175.91 - lr: 0.000046 - momentum: 0.000000
+ 2023-10-20 09:57:02,650 epoch 2 - iter 1386/1546 - loss 0.18188318 - time (sec): 21.34 - samples/sec: 5162.78 - lr: 0.000045 - momentum: 0.000000
+ 2023-10-20 09:57:05,066 epoch 2 - iter 1540/1546 - loss 0.18010251 - time (sec): 23.75 - samples/sec: 5209.21 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-20 09:57:05,155 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:57:05,155 EPOCH 2 done: loss 0.1797 - lr: 0.000044
+ 2023-10-20 09:57:06,238 DEV : loss 0.09568006545305252 - f1-score (micro avg) 0.4505
+ 2023-10-20 09:57:06,250 saving best model
+ 2023-10-20 09:57:06,289 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:57:08,762 epoch 3 - iter 154/1546 - loss 0.15246759 - time (sec): 2.47 - samples/sec: 5045.79 - lr: 0.000044 - momentum: 0.000000
+ 2023-10-20 09:57:11,131 epoch 3 - iter 308/1546 - loss 0.15171707 - time (sec): 4.84 - samples/sec: 4969.29 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-20 09:57:13,473 epoch 3 - iter 462/1546 - loss 0.14819470 - time (sec): 7.18 - samples/sec: 5146.44 - lr: 0.000043 - momentum: 0.000000
+ 2023-10-20 09:57:15,811 epoch 3 - iter 616/1546 - loss 0.14757401 - time (sec): 9.52 - samples/sec: 5193.61 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-20 09:57:18,159 epoch 3 - iter 770/1546 - loss 0.14865086 - time (sec): 11.87 - samples/sec: 5251.53 - lr: 0.000042 - momentum: 0.000000
+ 2023-10-20 09:57:20,503 epoch 3 - iter 924/1546 - loss 0.15129778 - time (sec): 14.21 - samples/sec: 5149.30 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-20 09:57:22,888 epoch 3 - iter 1078/1546 - loss 0.14983046 - time (sec): 16.60 - samples/sec: 5184.46 - lr: 0.000041 - momentum: 0.000000
+ 2023-10-20 09:57:25,248 epoch 3 - iter 1232/1546 - loss 0.14979943 - time (sec): 18.96 - samples/sec: 5220.74 - lr: 0.000040 - momentum: 0.000000
+ 2023-10-20 09:57:27,623 epoch 3 - iter 1386/1546 - loss 0.15151253 - time (sec): 21.33 - samples/sec: 5212.66 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-20 09:57:30,024 epoch 3 - iter 1540/1546 - loss 0.14973869 - time (sec): 23.73 - samples/sec: 5220.95 - lr: 0.000039 - momentum: 0.000000
+ 2023-10-20 09:57:30,119 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:57:30,119 EPOCH 3 done: loss 0.1500 - lr: 0.000039
+ 2023-10-20 09:57:31,211 DEV : loss 0.09354293346405029 - f1-score (micro avg) 0.5438
+ 2023-10-20 09:57:31,223 saving best model
+ 2023-10-20 09:57:31,257 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:57:33,586 epoch 4 - iter 154/1546 - loss 0.15714911 - time (sec): 2.33 - samples/sec: 5661.53 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-20 09:57:35,970 epoch 4 - iter 308/1546 - loss 0.14974834 - time (sec): 4.71 - samples/sec: 5257.99 - lr: 0.000038 - momentum: 0.000000
+ 2023-10-20 09:57:38,295 epoch 4 - iter 462/1546 - loss 0.14721188 - time (sec): 7.04 - samples/sec: 5145.14 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-20 09:57:40,675 epoch 4 - iter 616/1546 - loss 0.14478087 - time (sec): 9.42 - samples/sec: 5248.76 - lr: 0.000037 - momentum: 0.000000
+ 2023-10-20 09:57:43,001 epoch 4 - iter 770/1546 - loss 0.14146413 - time (sec): 11.74 - samples/sec: 5205.67 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-20 09:57:45,386 epoch 4 - iter 924/1546 - loss 0.13681608 - time (sec): 14.13 - samples/sec: 5194.75 - lr: 0.000036 - momentum: 0.000000
+ 2023-10-20 09:57:47,777 epoch 4 - iter 1078/1546 - loss 0.13568701 - time (sec): 16.52 - samples/sec: 5199.00 - lr: 0.000035 - momentum: 0.000000
+ 2023-10-20 09:57:50,225 epoch 4 - iter 1232/1546 - loss 0.13598660 - time (sec): 18.97 - samples/sec: 5200.73 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-20 09:57:52,616 epoch 4 - iter 1386/1546 - loss 0.13476872 - time (sec): 21.36 - samples/sec: 5213.71 - lr: 0.000034 - momentum: 0.000000
+ 2023-10-20 09:57:54,877 epoch 4 - iter 1540/1546 - loss 0.13332730 - time (sec): 23.62 - samples/sec: 5236.93 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-20 09:57:54,964 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:57:54,965 EPOCH 4 done: loss 0.1335 - lr: 0.000033
+ 2023-10-20 09:57:56,336 DEV : loss 0.08910585939884186 - f1-score (micro avg) 0.5781
+ 2023-10-20 09:57:56,348 saving best model
+ 2023-10-20 09:57:56,389 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:57:58,684 epoch 5 - iter 154/1546 - loss 0.11074956 - time (sec): 2.29 - samples/sec: 5207.64 - lr: 0.000033 - momentum: 0.000000
+ 2023-10-20 09:58:01,079 epoch 5 - iter 308/1546 - loss 0.12438368 - time (sec): 4.69 - samples/sec: 5411.83 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-20 09:58:03,466 epoch 5 - iter 462/1546 - loss 0.13150006 - time (sec): 7.08 - samples/sec: 5360.25 - lr: 0.000032 - momentum: 0.000000
+ 2023-10-20 09:58:05,767 epoch 5 - iter 616/1546 - loss 0.12690522 - time (sec): 9.38 - samples/sec: 5263.95 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-20 09:58:08,159 epoch 5 - iter 770/1546 - loss 0.12550895 - time (sec): 11.77 - samples/sec: 5297.28 - lr: 0.000031 - momentum: 0.000000
+ 2023-10-20 09:58:10,480 epoch 5 - iter 924/1546 - loss 0.12198991 - time (sec): 14.09 - samples/sec: 5273.19 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-20 09:58:12,851 epoch 5 - iter 1078/1546 - loss 0.12094948 - time (sec): 16.46 - samples/sec: 5285.15 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-20 09:58:15,325 epoch 5 - iter 1232/1546 - loss 0.12368525 - time (sec): 18.94 - samples/sec: 5235.62 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-20 09:58:17,724 epoch 5 - iter 1386/1546 - loss 0.12543156 - time (sec): 21.33 - samples/sec: 5235.20 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-20 09:58:20,123 epoch 5 - iter 1540/1546 - loss 0.12463983 - time (sec): 23.73 - samples/sec: 5213.99 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-20 09:58:20,217 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:58:20,217 EPOCH 5 done: loss 0.1245 - lr: 0.000028
+ 2023-10-20 09:58:21,313 DEV : loss 0.09435312449932098 - f1-score (micro avg) 0.5978
+ 2023-10-20 09:58:21,326 saving best model
+ 2023-10-20 09:58:21,360 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:58:23,747 epoch 6 - iter 154/1546 - loss 0.11053617 - time (sec): 2.39 - samples/sec: 4737.34 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-20 09:58:26,136 epoch 6 - iter 308/1546 - loss 0.11721437 - time (sec): 4.78 - samples/sec: 4972.77 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-20 09:58:28,544 epoch 6 - iter 462/1546 - loss 0.11731031 - time (sec): 7.18 - samples/sec: 5050.51 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-20 09:58:30,911 epoch 6 - iter 616/1546 - loss 0.11093231 - time (sec): 9.55 - samples/sec: 5101.05 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-20 09:58:33,258 epoch 6 - iter 770/1546 - loss 0.11841360 - time (sec): 11.90 - samples/sec: 5111.39 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-20 09:58:35,581 epoch 6 - iter 924/1546 - loss 0.11998999 - time (sec): 14.22 - samples/sec: 5129.85 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-20 09:58:37,939 epoch 6 - iter 1078/1546 - loss 0.11704457 - time (sec): 16.58 - samples/sec: 5155.77 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-20 09:58:40,326 epoch 6 - iter 1232/1546 - loss 0.11636736 - time (sec): 18.97 - samples/sec: 5193.59 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-20 09:58:42,726 epoch 6 - iter 1386/1546 - loss 0.11459251 - time (sec): 21.37 - samples/sec: 5200.43 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-20 09:58:45,072 epoch 6 - iter 1540/1546 - loss 0.11638062 - time (sec): 23.71 - samples/sec: 5211.77 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-20 09:58:45,178 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:58:45,179 EPOCH 6 done: loss 0.1158 - lr: 0.000022
+ 2023-10-20 09:58:46,284 DEV : loss 0.09260376542806625 - f1-score (micro avg) 0.6344
+ 2023-10-20 09:58:46,296 saving best model
+ 2023-10-20 09:58:46,333 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:58:48,697 epoch 7 - iter 154/1546 - loss 0.10982928 - time (sec): 2.36 - samples/sec: 5341.51 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-20 09:58:51,116 epoch 7 - iter 308/1546 - loss 0.10926222 - time (sec): 4.78 - samples/sec: 5195.27 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-20 09:58:53,492 epoch 7 - iter 462/1546 - loss 0.10878928 - time (sec): 7.16 - samples/sec: 5171.38 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-20 09:58:55,841 epoch 7 - iter 616/1546 - loss 0.11396563 - time (sec): 9.51 - samples/sec: 5221.24 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-20 09:58:58,270 epoch 7 - iter 770/1546 - loss 0.11096524 - time (sec): 11.94 - samples/sec: 5316.49 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-20 09:59:00,633 epoch 7 - iter 924/1546 - loss 0.10959747 - time (sec): 14.30 - samples/sec: 5299.06 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-20 09:59:03,007 epoch 7 - iter 1078/1546 - loss 0.11069285 - time (sec): 16.67 - samples/sec: 5260.74 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-20 09:59:05,348 epoch 7 - iter 1232/1546 - loss 0.11156839 - time (sec): 19.01 - samples/sec: 5246.88 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-20 09:59:07,702 epoch 7 - iter 1386/1546 - loss 0.11104971 - time (sec): 21.37 - samples/sec: 5237.77 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-20 09:59:10,067 epoch 7 - iter 1540/1546 - loss 0.10993191 - time (sec): 23.73 - samples/sec: 5217.63 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-20 09:59:10,167 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:59:10,167 EPOCH 7 done: loss 0.1098 - lr: 0.000017
+ 2023-10-20 09:59:11,270 DEV : loss 0.09885262697935104 - f1-score (micro avg) 0.6147
+ 2023-10-20 09:59:11,281 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:59:13,620 epoch 8 - iter 154/1546 - loss 0.10457091 - time (sec): 2.34 - samples/sec: 5371.89 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-20 09:59:16,035 epoch 8 - iter 308/1546 - loss 0.09573930 - time (sec): 4.75 - samples/sec: 5244.20 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-20 09:59:18,416 epoch 8 - iter 462/1546 - loss 0.09757855 - time (sec): 7.13 - samples/sec: 5222.20 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-20 09:59:20,846 epoch 8 - iter 616/1546 - loss 0.10232804 - time (sec): 9.56 - samples/sec: 5196.35 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-20 09:59:23,246 epoch 8 - iter 770/1546 - loss 0.10450911 - time (sec): 11.96 - samples/sec: 5213.17 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-20 09:59:25,609 epoch 8 - iter 924/1546 - loss 0.10466820 - time (sec): 14.33 - samples/sec: 5201.41 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-20 09:59:27,944 epoch 8 - iter 1078/1546 - loss 0.10690035 - time (sec): 16.66 - samples/sec: 5204.45 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-20 09:59:30,324 epoch 8 - iter 1232/1546 - loss 0.10601181 - time (sec): 19.04 - samples/sec: 5206.44 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-20 09:59:32,724 epoch 8 - iter 1386/1546 - loss 0.10561856 - time (sec): 21.44 - samples/sec: 5197.08 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-20 09:59:35,091 epoch 8 - iter 1540/1546 - loss 0.10482255 - time (sec): 23.81 - samples/sec: 5198.72 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-20 09:59:35,188 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:59:35,188 EPOCH 8 done: loss 0.1048 - lr: 0.000011
+ 2023-10-20 09:59:36,269 DEV : loss 0.1057317703962326 - f1-score (micro avg) 0.6261
+ 2023-10-20 09:59:36,281 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:59:38,590 epoch 9 - iter 154/1546 - loss 0.11356698 - time (sec): 2.31 - samples/sec: 5151.09 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-20 09:59:40,915 epoch 9 - iter 308/1546 - loss 0.10406692 - time (sec): 4.63 - samples/sec: 5176.55 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-20 09:59:43,279 epoch 9 - iter 462/1546 - loss 0.10401387 - time (sec): 7.00 - samples/sec: 5196.44 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-20 09:59:45,663 epoch 9 - iter 616/1546 - loss 0.09614905 - time (sec): 9.38 - samples/sec: 5264.89 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-20 09:59:48,049 epoch 9 - iter 770/1546 - loss 0.09323588 - time (sec): 11.77 - samples/sec: 5219.67 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-20 09:59:50,422 epoch 9 - iter 924/1546 - loss 0.09618676 - time (sec): 14.14 - samples/sec: 5189.75 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-20 09:59:52,822 epoch 9 - iter 1078/1546 - loss 0.09686063 - time (sec): 16.54 - samples/sec: 5188.16 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-20 09:59:55,194 epoch 9 - iter 1232/1546 - loss 0.09966681 - time (sec): 18.91 - samples/sec: 5213.59 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-20 09:59:57,578 epoch 9 - iter 1386/1546 - loss 0.09979067 - time (sec): 21.30 - samples/sec: 5253.50 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-20 09:59:59,886 epoch 9 - iter 1540/1546 - loss 0.10020466 - time (sec): 23.60 - samples/sec: 5245.93 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-20 09:59:59,978 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 09:59:59,978 EPOCH 9 done: loss 0.1004 - lr: 0.000006
+ 2023-10-20 10:00:01,073 DEV : loss 0.10489093512296677 - f1-score (micro avg) 0.6565
+ 2023-10-20 10:00:01,085 saving best model
+ 2023-10-20 10:00:01,123 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 10:00:03,228 epoch 10 - iter 154/1546 - loss 0.11356945 - time (sec): 2.10 - samples/sec: 5568.14 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-20 10:00:05,574 epoch 10 - iter 308/1546 - loss 0.10220205 - time (sec): 4.45 - samples/sec: 5477.44 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-20 10:00:07,947 epoch 10 - iter 462/1546 - loss 0.09862619 - time (sec): 6.82 - samples/sec: 5456.90 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-20 10:00:10,344 epoch 10 - iter 616/1546 - loss 0.09815601 - time (sec): 9.22 - samples/sec: 5398.30 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-20 10:00:12,711 epoch 10 - iter 770/1546 - loss 0.09919748 - time (sec): 11.59 - samples/sec: 5350.37 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-20 10:00:15,063 epoch 10 - iter 924/1546 - loss 0.10078099 - time (sec): 13.94 - samples/sec: 5288.38 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-20 10:00:17,441 epoch 10 - iter 1078/1546 - loss 0.10013561 - time (sec): 16.32 - samples/sec: 5287.68 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-20 10:00:19,854 epoch 10 - iter 1232/1546 - loss 0.09648546 - time (sec): 18.73 - samples/sec: 5296.75 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-20 10:00:22,219 epoch 10 - iter 1386/1546 - loss 0.09661131 - time (sec): 21.10 - samples/sec: 5270.05 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-20 10:00:24,599 epoch 10 - iter 1540/1546 - loss 0.09632277 - time (sec): 23.48 - samples/sec: 5267.14 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-20 10:00:24,695 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 10:00:24,696 EPOCH 10 done: loss 0.0962 - lr: 0.000000
+ 2023-10-20 10:00:25,790 DEV : loss 0.10527437180280685 - f1-score (micro avg) 0.6468
+ 2023-10-20 10:00:25,836 ----------------------------------------------------------------------------------------------------
+ 2023-10-20 10:00:25,836 Loading model from best epoch ...
+ 2023-10-20 10:00:25,912 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
+ 2023-10-20 10:00:28,836
+ Results:
+ - F-score (micro) 0.5986
+ - F-score (macro) 0.342
+ - Accuracy 0.4367
+
+ By class:
+               precision    recall  f1-score   support
+
+          LOC     0.6831    0.6723    0.6777       946
+     BUILDING     0.2317    0.1027    0.1423       185
+       STREET     0.5833    0.1250    0.2059        56
+
+    micro avg     0.6459    0.5577    0.5986      1187
+    macro avg     0.4994    0.3000    0.3420      1187
+ weighted avg     0.6081    0.5577    0.5720      1187
+
+ 2023-10-20 10:00:28,836 ----------------------------------------------------------------------------------------------------
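The averages in the final report follow from the per-class rows. As a rough consistency check, the sketch below recomputes each F1 as the harmonic mean of precision and recall, then the macro average (unweighted mean over classes) and the weighted average (mean weighted by support). It starts from the rounded four-decimal values in the log, so agreement is only expected to within rounding; the variable names are illustrative.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, support) per class, copied from the test report.
REPORT = {
    "LOC": (0.6831, 0.6723, 946),
    "BUILDING": (0.2317, 0.1027, 185),
    "STREET": (0.5833, 0.1250, 56),
}

class_f1 = {name: f1(p, r) for name, (p, r, _) in REPORT.items()}
total_support = sum(support for _, _, support in REPORT.values())

# Macro: every class counts equally, so the weak BUILDING and STREET
# scores pull the average down.
macro_f1 = sum(class_f1.values()) / len(class_f1)

# Weighted: classes count in proportion to their support.
weighted_f1 = sum(class_f1[n] * REPORT[n][2] for n in REPORT) / total_support

# Micro: computed here from the reported micro-average precision and recall.
micro_f1 = f1(0.6459, 0.5577)

for name, value in class_f1.items():
    print(f"{name:>12} {value:.4f}")
print(f"{'micro avg':>12} {micro_f1:.4f}")
print(f"{'macro avg':>12} {macro_f1:.4f}")
print(f"{'weighted avg':>12} {weighted_f1:.4f}")
```

The gap between micro F1 (0.5986) and macro F1 (0.3420) reflects class imbalance: LOC makes up 946 of the 1187 test entities and is the only class the model predicts reasonably well.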