2023-10-13 18:27:41,589 ----------------------------------------------------------------------------------------------------
2023-10-13 18:27:41,592 Model: "SequenceTagger(
(embeddings): ByT5Embeddings(
(model): T5EncoderModel(
(shared): Embedding(384, 1472)
(encoder): T5Stack(
(embed_tokens): Embedding(384, 1472)
(block): ModuleList(
(0): T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
(relative_attention_bias): Embedding(32, 6)
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(1-11): 11 x T5Block(
(layer): ModuleList(
(0): T5LayerSelfAttention(
(SelfAttention): T5Attention(
(q): Linear(in_features=1472, out_features=384, bias=False)
(k): Linear(in_features=1472, out_features=384, bias=False)
(v): Linear(in_features=1472, out_features=384, bias=False)
(o): Linear(in_features=384, out_features=1472, bias=False)
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(1): T5LayerFF(
(DenseReluDense): T5DenseGatedActDense(
(wi_0): Linear(in_features=1472, out_features=3584, bias=False)
(wi_1): Linear(in_features=1472, out_features=3584, bias=False)
(wo): Linear(in_features=3584, out_features=1472, bias=False)
(dropout): Dropout(p=0.1, inplace=False)
(act): NewGELUActivation()
)
(layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=1472, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-13 18:27:41,592 ----------------------------------------------------------------------------------------------------
2023-10-13 18:27:41,593 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
- NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-13 18:27:41,593 ----------------------------------------------------------------------------------------------------
2023-10-13 18:27:41,593 Train: 6183 sentences
2023-10-13 18:27:41,593 (train_with_dev=False, train_with_test=False)
2023-10-13 18:27:41,593 ----------------------------------------------------------------------------------------------------
2023-10-13 18:27:41,593 Training Params:
2023-10-13 18:27:41,593 - learning_rate: "0.00016"
2023-10-13 18:27:41,593 - mini_batch_size: "8"
2023-10-13 18:27:41,593 - max_epochs: "10"
2023-10-13 18:27:41,593 - shuffle: "True"
2023-10-13 18:27:41,593 ----------------------------------------------------------------------------------------------------
2023-10-13 18:27:41,593 Plugins:
2023-10-13 18:27:41,594 - TensorboardLogger
2023-10-13 18:27:41,594 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 18:27:41,594 ----------------------------------------------------------------------------------------------------
2023-10-13 18:27:41,594 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 18:27:41,594 - metric: "('micro avg', 'f1-score')"
2023-10-13 18:27:41,594 ----------------------------------------------------------------------------------------------------
2023-10-13 18:27:41,594 Computation:
2023-10-13 18:27:41,594 - compute on device: cuda:0
2023-10-13 18:27:41,594 - embedding storage: none
2023-10-13 18:27:41,594 ----------------------------------------------------------------------------------------------------
2023-10-13 18:27:41,594 Model training base path: "hmbench-topres19th/en-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs8-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-3"
2023-10-13 18:27:41,594 ----------------------------------------------------------------------------------------------------
2023-10-13 18:27:41,594 ----------------------------------------------------------------------------------------------------
2023-10-13 18:27:41,595 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-13 18:28:20,526 epoch 1 - iter 77/773 - loss 2.53718000 - time (sec): 38.93 - samples/sec: 296.56 - lr: 0.000016 - momentum: 0.000000
2023-10-13 18:29:01,595 epoch 1 - iter 154/773 - loss 2.48898463 - time (sec): 80.00 - samples/sec: 300.97 - lr: 0.000032 - momentum: 0.000000
2023-10-13 18:29:42,476 epoch 1 - iter 231/773 - loss 2.31540883 - time (sec): 120.88 - samples/sec: 302.87 - lr: 0.000048 - momentum: 0.000000
2023-10-13 18:30:23,429 epoch 1 - iter 308/773 - loss 2.09255303 - time (sec): 161.83 - samples/sec: 303.88 - lr: 0.000064 - momentum: 0.000000
2023-10-13 18:31:04,341 epoch 1 - iter 385/773 - loss 1.87066774 - time (sec): 202.74 - samples/sec: 301.42 - lr: 0.000079 - momentum: 0.000000
2023-10-13 18:31:45,007 epoch 1 - iter 462/773 - loss 1.63966595 - time (sec): 243.41 - samples/sec: 301.19 - lr: 0.000095 - momentum: 0.000000
2023-10-13 18:32:25,330 epoch 1 - iter 539/773 - loss 1.44183445 - time (sec): 283.73 - samples/sec: 301.82 - lr: 0.000111 - momentum: 0.000000
2023-10-13 18:33:05,452 epoch 1 - iter 616/773 - loss 1.28582098 - time (sec): 323.85 - samples/sec: 303.52 - lr: 0.000127 - momentum: 0.000000
2023-10-13 18:33:45,138 epoch 1 - iter 693/773 - loss 1.16590252 - time (sec): 363.54 - samples/sec: 304.47 - lr: 0.000143 - momentum: 0.000000
2023-10-13 18:34:25,653 epoch 1 - iter 770/773 - loss 1.05920427 - time (sec): 404.06 - samples/sec: 306.69 - lr: 0.000159 - momentum: 0.000000
2023-10-13 18:34:27,131 ----------------------------------------------------------------------------------------------------
2023-10-13 18:34:27,131 EPOCH 1 done: loss 1.0564 - lr: 0.000159
2023-10-13 18:34:44,289 DEV : loss 0.09498981386423111 - f1-score (micro avg) 0.1032
2023-10-13 18:34:44,326 saving best model
2023-10-13 18:34:45,254 ----------------------------------------------------------------------------------------------------
2023-10-13 18:35:24,550 epoch 2 - iter 77/773 - loss 0.13415039 - time (sec): 39.29 - samples/sec: 285.68 - lr: 0.000158 - momentum: 0.000000
2023-10-13 18:36:04,547 epoch 2 - iter 154/773 - loss 0.12506112 - time (sec): 79.29 - samples/sec: 292.52 - lr: 0.000156 - momentum: 0.000000
2023-10-13 18:36:45,745 epoch 2 - iter 231/773 - loss 0.12251225 - time (sec): 120.49 - samples/sec: 301.08 - lr: 0.000155 - momentum: 0.000000
2023-10-13 18:37:25,876 epoch 2 - iter 308/773 - loss 0.12159004 - time (sec): 160.62 - samples/sec: 304.42 - lr: 0.000153 - momentum: 0.000000
2023-10-13 18:38:06,724 epoch 2 - iter 385/773 - loss 0.11966517 - time (sec): 201.47 - samples/sec: 305.43 - lr: 0.000151 - momentum: 0.000000
2023-10-13 18:38:47,004 epoch 2 - iter 462/773 - loss 0.11708834 - time (sec): 241.75 - samples/sec: 302.56 - lr: 0.000149 - momentum: 0.000000
2023-10-13 18:39:27,016 epoch 2 - iter 539/773 - loss 0.11412131 - time (sec): 281.76 - samples/sec: 301.08 - lr: 0.000148 - momentum: 0.000000
2023-10-13 18:40:09,251 epoch 2 - iter 616/773 - loss 0.11026100 - time (sec): 323.99 - samples/sec: 304.15 - lr: 0.000146 - momentum: 0.000000
2023-10-13 18:40:50,301 epoch 2 - iter 693/773 - loss 0.10712482 - time (sec): 365.04 - samples/sec: 302.70 - lr: 0.000144 - momentum: 0.000000
2023-10-13 18:41:31,846 epoch 2 - iter 770/773 - loss 0.10501421 - time (sec): 406.59 - samples/sec: 304.87 - lr: 0.000142 - momentum: 0.000000
2023-10-13 18:41:33,254 ----------------------------------------------------------------------------------------------------
2023-10-13 18:41:33,255 EPOCH 2 done: loss 0.1052 - lr: 0.000142
2023-10-13 18:41:50,173 DEV : loss 0.06088424101471901 - f1-score (micro avg) 0.7383
2023-10-13 18:41:50,217 saving best model
2023-10-13 18:41:52,843 ----------------------------------------------------------------------------------------------------
2023-10-13 18:42:34,205 epoch 3 - iter 77/773 - loss 0.06486646 - time (sec): 41.36 - samples/sec: 310.77 - lr: 0.000140 - momentum: 0.000000
2023-10-13 18:43:14,268 epoch 3 - iter 154/773 - loss 0.07228987 - time (sec): 81.42 - samples/sec: 309.51 - lr: 0.000139 - momentum: 0.000000
2023-10-13 18:43:54,505 epoch 3 - iter 231/773 - loss 0.06639483 - time (sec): 121.66 - samples/sec: 307.49 - lr: 0.000137 - momentum: 0.000000
2023-10-13 18:44:34,248 epoch 3 - iter 308/773 - loss 0.06650222 - time (sec): 161.40 - samples/sec: 306.63 - lr: 0.000135 - momentum: 0.000000
2023-10-13 18:45:13,628 epoch 3 - iter 385/773 - loss 0.06605857 - time (sec): 200.78 - samples/sec: 305.42 - lr: 0.000133 - momentum: 0.000000
2023-10-13 18:45:53,131 epoch 3 - iter 462/773 - loss 0.06528790 - time (sec): 240.28 - samples/sec: 307.54 - lr: 0.000132 - momentum: 0.000000
2023-10-13 18:46:33,090 epoch 3 - iter 539/773 - loss 0.06329194 - time (sec): 280.24 - samples/sec: 308.45 - lr: 0.000130 - momentum: 0.000000
2023-10-13 18:47:13,860 epoch 3 - iter 616/773 - loss 0.06104666 - time (sec): 321.01 - samples/sec: 307.92 - lr: 0.000128 - momentum: 0.000000
2023-10-13 18:47:54,228 epoch 3 - iter 693/773 - loss 0.06137439 - time (sec): 361.38 - samples/sec: 307.73 - lr: 0.000126 - momentum: 0.000000
2023-10-13 18:48:34,430 epoch 3 - iter 770/773 - loss 0.06200833 - time (sec): 401.58 - samples/sec: 307.89 - lr: 0.000125 - momentum: 0.000000
2023-10-13 18:48:36,054 ----------------------------------------------------------------------------------------------------
2023-10-13 18:48:36,054 EPOCH 3 done: loss 0.0622 - lr: 0.000125
2023-10-13 18:48:53,579 DEV : loss 0.06357744336128235 - f1-score (micro avg) 0.766
2023-10-13 18:48:53,609 saving best model
2023-10-13 18:48:56,619 ----------------------------------------------------------------------------------------------------
2023-10-13 18:49:36,633 epoch 4 - iter 77/773 - loss 0.03897836 - time (sec): 40.01 - samples/sec: 298.95 - lr: 0.000123 - momentum: 0.000000
2023-10-13 18:50:16,835 epoch 4 - iter 154/773 - loss 0.04517362 - time (sec): 80.21 - samples/sec: 300.48 - lr: 0.000121 - momentum: 0.000000
2023-10-13 18:50:57,845 epoch 4 - iter 231/773 - loss 0.04320695 - time (sec): 121.22 - samples/sec: 303.25 - lr: 0.000119 - momentum: 0.000000
2023-10-13 18:51:37,322 epoch 4 - iter 308/773 - loss 0.04049507 - time (sec): 160.70 - samples/sec: 301.14 - lr: 0.000117 - momentum: 0.000000
2023-10-13 18:52:16,144 epoch 4 - iter 385/773 - loss 0.04133745 - time (sec): 199.52 - samples/sec: 304.51 - lr: 0.000116 - momentum: 0.000000
2023-10-13 18:52:58,384 epoch 4 - iter 462/773 - loss 0.04043949 - time (sec): 241.76 - samples/sec: 305.71 - lr: 0.000114 - momentum: 0.000000
2023-10-13 18:53:39,335 epoch 4 - iter 539/773 - loss 0.04060850 - time (sec): 282.71 - samples/sec: 304.07 - lr: 0.000112 - momentum: 0.000000
2023-10-13 18:54:20,890 epoch 4 - iter 616/773 - loss 0.04091750 - time (sec): 324.27 - samples/sec: 304.60 - lr: 0.000110 - momentum: 0.000000
2023-10-13 18:55:02,576 epoch 4 - iter 693/773 - loss 0.04019435 - time (sec): 365.95 - samples/sec: 304.98 - lr: 0.000109 - momentum: 0.000000
2023-10-13 18:55:44,589 epoch 4 - iter 770/773 - loss 0.03982455 - time (sec): 407.97 - samples/sec: 303.41 - lr: 0.000107 - momentum: 0.000000
2023-10-13 18:55:46,197 ----------------------------------------------------------------------------------------------------
2023-10-13 18:55:46,198 EPOCH 4 done: loss 0.0398 - lr: 0.000107
2023-10-13 18:56:02,928 DEV : loss 0.06280948221683502 - f1-score (micro avg) 0.7692
2023-10-13 18:56:02,957 saving best model
2023-10-13 18:56:05,585 ----------------------------------------------------------------------------------------------------
2023-10-13 18:56:47,185 epoch 5 - iter 77/773 - loss 0.02565174 - time (sec): 41.60 - samples/sec: 314.77 - lr: 0.000105 - momentum: 0.000000
2023-10-13 18:57:28,583 epoch 5 - iter 154/773 - loss 0.02318898 - time (sec): 82.99 - samples/sec: 294.64 - lr: 0.000103 - momentum: 0.000000
2023-10-13 18:58:10,782 epoch 5 - iter 231/773 - loss 0.02462375 - time (sec): 125.19 - samples/sec: 297.66 - lr: 0.000101 - momentum: 0.000000
2023-10-13 18:58:53,107 epoch 5 - iter 308/773 - loss 0.02474654 - time (sec): 167.52 - samples/sec: 297.31 - lr: 0.000100 - momentum: 0.000000
2023-10-13 18:59:35,526 epoch 5 - iter 385/773 - loss 0.02677004 - time (sec): 209.94 - samples/sec: 299.11 - lr: 0.000098 - momentum: 0.000000
2023-10-13 19:00:18,220 epoch 5 - iter 462/773 - loss 0.02659318 - time (sec): 252.63 - samples/sec: 298.87 - lr: 0.000096 - momentum: 0.000000
2023-10-13 19:01:00,301 epoch 5 - iter 539/773 - loss 0.02661496 - time (sec): 294.71 - samples/sec: 297.66 - lr: 0.000094 - momentum: 0.000000
2023-10-13 19:01:41,476 epoch 5 - iter 616/773 - loss 0.02613464 - time (sec): 335.89 - samples/sec: 298.65 - lr: 0.000093 - momentum: 0.000000
2023-10-13 19:02:22,988 epoch 5 - iter 693/773 - loss 0.02685763 - time (sec): 377.40 - samples/sec: 297.89 - lr: 0.000091 - momentum: 0.000000
2023-10-13 19:03:02,927 epoch 5 - iter 770/773 - loss 0.02640570 - time (sec): 417.34 - samples/sec: 296.87 - lr: 0.000089 - momentum: 0.000000
2023-10-13 19:03:04,362 ----------------------------------------------------------------------------------------------------
2023-10-13 19:03:04,362 EPOCH 5 done: loss 0.0265 - lr: 0.000089
2023-10-13 19:03:21,469 DEV : loss 0.07900705188512802 - f1-score (micro avg) 0.7648
2023-10-13 19:03:21,500 ----------------------------------------------------------------------------------------------------
2023-10-13 19:03:59,895 epoch 6 - iter 77/773 - loss 0.01890750 - time (sec): 38.39 - samples/sec: 344.13 - lr: 0.000087 - momentum: 0.000000
2023-10-13 19:04:36,506 epoch 6 - iter 154/773 - loss 0.01803645 - time (sec): 75.00 - samples/sec: 328.28 - lr: 0.000085 - momentum: 0.000000
2023-10-13 19:05:14,634 epoch 6 - iter 231/773 - loss 0.01845279 - time (sec): 113.13 - samples/sec: 334.94 - lr: 0.000084 - momentum: 0.000000
2023-10-13 19:05:52,977 epoch 6 - iter 308/773 - loss 0.01799186 - time (sec): 151.48 - samples/sec: 331.14 - lr: 0.000082 - momentum: 0.000000
2023-10-13 19:06:33,411 epoch 6 - iter 385/773 - loss 0.01752207 - time (sec): 191.91 - samples/sec: 321.86 - lr: 0.000080 - momentum: 0.000000
2023-10-13 19:07:14,518 epoch 6 - iter 462/773 - loss 0.01782621 - time (sec): 233.02 - samples/sec: 319.82 - lr: 0.000078 - momentum: 0.000000
2023-10-13 19:07:54,522 epoch 6 - iter 539/773 - loss 0.01774540 - time (sec): 273.02 - samples/sec: 316.30 - lr: 0.000077 - momentum: 0.000000
2023-10-13 19:08:33,471 epoch 6 - iter 616/773 - loss 0.01739339 - time (sec): 311.97 - samples/sec: 314.26 - lr: 0.000075 - momentum: 0.000000
2023-10-13 19:09:13,056 epoch 6 - iter 693/773 - loss 0.01719655 - time (sec): 351.55 - samples/sec: 313.71 - lr: 0.000073 - momentum: 0.000000
2023-10-13 19:09:54,830 epoch 6 - iter 770/773 - loss 0.01749948 - time (sec): 393.33 - samples/sec: 315.00 - lr: 0.000071 - momentum: 0.000000
2023-10-13 19:09:56,293 ----------------------------------------------------------------------------------------------------
2023-10-13 19:09:56,293 EPOCH 6 done: loss 0.0176 - lr: 0.000071
2023-10-13 19:10:13,261 DEV : loss 0.08565299212932587 - f1-score (micro avg) 0.7816
2023-10-13 19:10:13,290 saving best model
2023-10-13 19:10:15,923 ----------------------------------------------------------------------------------------------------
2023-10-13 19:10:57,165 epoch 7 - iter 77/773 - loss 0.00810359 - time (sec): 41.24 - samples/sec: 309.57 - lr: 0.000069 - momentum: 0.000000
2023-10-13 19:11:38,072 epoch 7 - iter 154/773 - loss 0.01107264 - time (sec): 82.14 - samples/sec: 302.53 - lr: 0.000068 - momentum: 0.000000
2023-10-13 19:12:18,844 epoch 7 - iter 231/773 - loss 0.01177975 - time (sec): 122.92 - samples/sec: 303.13 - lr: 0.000066 - momentum: 0.000000
2023-10-13 19:12:59,044 epoch 7 - iter 308/773 - loss 0.01189685 - time (sec): 163.12 - samples/sec: 306.33 - lr: 0.000064 - momentum: 0.000000
2023-10-13 19:13:38,918 epoch 7 - iter 385/773 - loss 0.01172982 - time (sec): 202.99 - samples/sec: 306.43 - lr: 0.000062 - momentum: 0.000000
2023-10-13 19:14:19,066 epoch 7 - iter 462/773 - loss 0.01138355 - time (sec): 243.14 - samples/sec: 304.92 - lr: 0.000061 - momentum: 0.000000
2023-10-13 19:14:58,917 epoch 7 - iter 539/773 - loss 0.01127963 - time (sec): 282.99 - samples/sec: 305.52 - lr: 0.000059 - momentum: 0.000000
2023-10-13 19:15:38,339 epoch 7 - iter 616/773 - loss 0.01107735 - time (sec): 322.41 - samples/sec: 307.52 - lr: 0.000057 - momentum: 0.000000
2023-10-13 19:16:18,228 epoch 7 - iter 693/773 - loss 0.01205704 - time (sec): 362.30 - samples/sec: 307.05 - lr: 0.000055 - momentum: 0.000000
2023-10-13 19:16:58,631 epoch 7 - iter 770/773 - loss 0.01195266 - time (sec): 402.70 - samples/sec: 307.39 - lr: 0.000054 - momentum: 0.000000
2023-10-13 19:17:00,148 ----------------------------------------------------------------------------------------------------
2023-10-13 19:17:00,148 EPOCH 7 done: loss 0.0120 - lr: 0.000054
2023-10-13 19:17:17,015 DEV : loss 0.08720681816339493 - f1-score (micro avg) 0.7968
2023-10-13 19:17:17,044 saving best model
2023-10-13 19:17:19,668 ----------------------------------------------------------------------------------------------------
2023-10-13 19:18:00,532 epoch 8 - iter 77/773 - loss 0.01118203 - time (sec): 40.86 - samples/sec: 326.21 - lr: 0.000052 - momentum: 0.000000
2023-10-13 19:18:40,280 epoch 8 - iter 154/773 - loss 0.01018958 - time (sec): 80.61 - samples/sec: 322.64 - lr: 0.000050 - momentum: 0.000000
2023-10-13 19:19:19,475 epoch 8 - iter 231/773 - loss 0.01021190 - time (sec): 119.80 - samples/sec: 316.30 - lr: 0.000048 - momentum: 0.000000
2023-10-13 19:19:58,615 epoch 8 - iter 308/773 - loss 0.00973996 - time (sec): 158.94 - samples/sec: 316.65 - lr: 0.000046 - momentum: 0.000000
2023-10-13 19:20:39,703 epoch 8 - iter 385/773 - loss 0.01031258 - time (sec): 200.03 - samples/sec: 318.96 - lr: 0.000045 - momentum: 0.000000
2023-10-13 19:21:21,213 epoch 8 - iter 462/773 - loss 0.01139138 - time (sec): 241.54 - samples/sec: 314.56 - lr: 0.000043 - momentum: 0.000000
2023-10-13 19:22:00,888 epoch 8 - iter 539/773 - loss 0.01032455 - time (sec): 281.22 - samples/sec: 313.49 - lr: 0.000041 - momentum: 0.000000
2023-10-13 19:22:39,467 epoch 8 - iter 616/773 - loss 0.01013119 - time (sec): 319.79 - samples/sec: 309.88 - lr: 0.000039 - momentum: 0.000000
2023-10-13 19:23:19,791 epoch 8 - iter 693/773 - loss 0.00963902 - time (sec): 360.12 - samples/sec: 308.51 - lr: 0.000038 - momentum: 0.000000
2023-10-13 19:24:00,541 epoch 8 - iter 770/773 - loss 0.00920310 - time (sec): 400.87 - samples/sec: 309.03 - lr: 0.000036 - momentum: 0.000000
2023-10-13 19:24:02,059 ----------------------------------------------------------------------------------------------------
2023-10-13 19:24:02,059 EPOCH 8 done: loss 0.0093 - lr: 0.000036
2023-10-13 19:24:20,534 DEV : loss 0.09538504481315613 - f1-score (micro avg) 0.7842
2023-10-13 19:24:20,568 ----------------------------------------------------------------------------------------------------
2023-10-13 19:25:04,481 epoch 9 - iter 77/773 - loss 0.00761122 - time (sec): 43.91 - samples/sec: 292.32 - lr: 0.000034 - momentum: 0.000000
2023-10-13 19:25:47,057 epoch 9 - iter 154/773 - loss 0.00548572 - time (sec): 86.49 - samples/sec: 298.14 - lr: 0.000032 - momentum: 0.000000
2023-10-13 19:26:27,944 epoch 9 - iter 231/773 - loss 0.00513351 - time (sec): 127.37 - samples/sec: 292.68 - lr: 0.000030 - momentum: 0.000000
2023-10-13 19:27:07,960 epoch 9 - iter 308/773 - loss 0.00485361 - time (sec): 167.39 - samples/sec: 294.65 - lr: 0.000029 - momentum: 0.000000
2023-10-13 19:27:49,243 epoch 9 - iter 385/773 - loss 0.00561098 - time (sec): 208.67 - samples/sec: 294.06 - lr: 0.000027 - momentum: 0.000000
2023-10-13 19:28:30,009 epoch 9 - iter 462/773 - loss 0.00629165 - time (sec): 249.44 - samples/sec: 292.00 - lr: 0.000025 - momentum: 0.000000
2023-10-13 19:29:10,691 epoch 9 - iter 539/773 - loss 0.00715536 - time (sec): 290.12 - samples/sec: 295.69 - lr: 0.000023 - momentum: 0.000000
2023-10-13 19:29:50,910 epoch 9 - iter 616/773 - loss 0.00747397 - time (sec): 330.34 - samples/sec: 296.84 - lr: 0.000022 - momentum: 0.000000
2023-10-13 19:30:30,991 epoch 9 - iter 693/773 - loss 0.00752234 - time (sec): 370.42 - samples/sec: 297.66 - lr: 0.000020 - momentum: 0.000000
2023-10-13 19:31:12,553 epoch 9 - iter 770/773 - loss 0.00710341 - time (sec): 411.98 - samples/sec: 300.62 - lr: 0.000018 - momentum: 0.000000
2023-10-13 19:31:14,034 ----------------------------------------------------------------------------------------------------
2023-10-13 19:31:14,034 EPOCH 9 done: loss 0.0071 - lr: 0.000018
2023-10-13 19:31:31,165 DEV : loss 0.09801825881004333 - f1-score (micro avg) 0.8065
2023-10-13 19:31:31,194 saving best model
2023-10-13 19:31:33,834 ----------------------------------------------------------------------------------------------------
2023-10-13 19:32:15,371 epoch 10 - iter 77/773 - loss 0.00436375 - time (sec): 41.53 - samples/sec: 292.59 - lr: 0.000016 - momentum: 0.000000
2023-10-13 19:32:55,311 epoch 10 - iter 154/773 - loss 0.00453929 - time (sec): 81.47 - samples/sec: 289.80 - lr: 0.000014 - momentum: 0.000000
2023-10-13 19:33:36,007 epoch 10 - iter 231/773 - loss 0.00590183 - time (sec): 122.17 - samples/sec: 289.44 - lr: 0.000013 - momentum: 0.000000
2023-10-13 19:34:16,478 epoch 10 - iter 308/773 - loss 0.00508722 - time (sec): 162.64 - samples/sec: 295.92 - lr: 0.000011 - momentum: 0.000000
2023-10-13 19:34:57,766 epoch 10 - iter 385/773 - loss 0.00493063 - time (sec): 203.93 - samples/sec: 296.65 - lr: 0.000009 - momentum: 0.000000
2023-10-13 19:35:38,905 epoch 10 - iter 462/773 - loss 0.00476355 - time (sec): 245.07 - samples/sec: 300.42 - lr: 0.000007 - momentum: 0.000000
2023-10-13 19:36:19,615 epoch 10 - iter 539/773 - loss 0.00511343 - time (sec): 285.78 - samples/sec: 301.60 - lr: 0.000006 - momentum: 0.000000
2023-10-13 19:37:01,549 epoch 10 - iter 616/773 - loss 0.00517065 - time (sec): 327.71 - samples/sec: 303.13 - lr: 0.000004 - momentum: 0.000000
2023-10-13 19:37:42,385 epoch 10 - iter 693/773 - loss 0.00527857 - time (sec): 368.55 - samples/sec: 302.91 - lr: 0.000002 - momentum: 0.000000
2023-10-13 19:38:22,556 epoch 10 - iter 770/773 - loss 0.00509281 - time (sec): 408.72 - samples/sec: 302.80 - lr: 0.000000 - momentum: 0.000000
2023-10-13 19:38:24,138 ----------------------------------------------------------------------------------------------------
2023-10-13 19:38:24,138 EPOCH 10 done: loss 0.0051 - lr: 0.000000
2023-10-13 19:38:41,062 DEV : loss 0.10153676569461823 - f1-score (micro avg) 0.8008
2023-10-13 19:38:42,015 ----------------------------------------------------------------------------------------------------
2023-10-13 19:38:42,017 Loading model from best epoch ...
2023-10-13 19:38:46,544 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-13 19:39:42,682
Results:
- F-score (micro) 0.7988
- F-score (macro) 0.7062
- Accuracy 0.6839
By class:
                precision    recall  f1-score   support

         LOC       0.8541    0.8541    0.8541       946
    BUILDING       0.5500    0.4757    0.5101       185
      STREET       0.7414    0.7679    0.7544        56

   micro avg       0.8067    0.7911    0.7988      1187
   macro avg       0.7152    0.6992    0.7062      1187
weighted avg       0.8014    0.7911    0.7958      1187
2023-10-13 19:39:42,682 ----------------------------------------------------------------------------------------------------