2023-10-12 16:33:19,371 ----------------------------------------------------------------------------------------------------
2023-10-12 16:33:19,373 Model: "SequenceTagger(
  (embeddings): ByT5Embeddings(
    (model): T5EncoderModel(
      (shared): Embedding(384, 1472)
      (encoder): T5Stack(
        (embed_tokens): Embedding(384, 1472)
        (block): ModuleList(
          (0): T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                  (relative_attention_bias): Embedding(32, 6)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
          (1-11): 11 x T5Block(
            (layer): ModuleList(
              (0): T5LayerSelfAttention(
                (SelfAttention): T5Attention(
                  (q): Linear(in_features=1472, out_features=384, bias=False)
                  (k): Linear(in_features=1472, out_features=384, bias=False)
                  (v): Linear(in_features=1472, out_features=384, bias=False)
                  (o): Linear(in_features=384, out_features=1472, bias=False)
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (1): T5LayerFF(
                (DenseReluDense): T5DenseGatedActDense(
                  (wi_0): Linear(in_features=1472, out_features=3584, bias=False)
                  (wi_1): Linear(in_features=1472, out_features=3584, bias=False)
                  (wo): Linear(in_features=3584, out_features=1472, bias=False)
                  (dropout): Dropout(p=0.1, inplace=False)
                  (act): NewGELUActivation()
                )
                (layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
          )
        )
        (final_layer_norm): FusedRMSNorm(torch.Size([1472]), eps=1e-06, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1472, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-12 16:33:19,373 ----------------------------------------------------------------------------------------------------
2023-10-12 16:33:19,374 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-12 16:33:19,374 ----------------------------------------------------------------------------------------------------
2023-10-12 16:33:19,374 Train: 5777 sentences
2023-10-12 16:33:19,374 (train_with_dev=False, train_with_test=False)
2023-10-12 16:33:19,374 ----------------------------------------------------------------------------------------------------
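[Editor's note] A minimal sketch of how this corpus could be loaded with Flair; the loader name and language code are inferred from the cache path logged above (/root/.flair/datasets/ner_icdar_europeana/nl). This is an assumption, not the original training script.

    # Sketch (assumption): load the ICDAR Europeana Dutch NER corpus via Flair's
    # dataset loader and build the NER label dictionary used by the tagger.
    from flair.datasets import NER_ICDAR_EUROPEANA

    corpus = NER_ICDAR_EUROPEANA(language="nl")   # 5777 train / 722 dev / 723 test sentences
    label_dict = corpus.make_label_dictionary(label_type="ner")
    print(corpus)
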
2023-10-12 16:33:19,374 Training Params:
2023-10-12 16:33:19,374 - learning_rate: "0.00016"
2023-10-12 16:33:19,374 - mini_batch_size: "4"
2023-10-12 16:33:19,374 - max_epochs: "10"
2023-10-12 16:33:19,374 - shuffle: "True"
2023-10-12 16:33:19,374 ----------------------------------------------------------------------------------------------------
2023-10-12 16:33:19,374 Plugins:
2023-10-12 16:33:19,374 - TensorboardLogger
2023-10-12 16:33:19,375 - LinearScheduler | warmup_fraction: '0.1'
2023-10-12 16:33:19,375 ----------------------------------------------------------------------------------------------------
2023-10-12 16:33:19,375 Final evaluation on model from best epoch (best-model.pt)
2023-10-12 16:33:19,375 - metric: "('micro avg', 'f1-score')"
2023-10-12 16:33:19,375 ----------------------------------------------------------------------------------------------------
2023-10-12 16:33:19,375 Computation:
2023-10-12 16:33:19,375 - compute on device: cuda:0
2023-10-12 16:33:19,375 - embedding storage: none
2023-10-12 16:33:19,375 ----------------------------------------------------------------------------------------------------
2023-10-12 16:33:19,375 Model training base path: "hmbench-icdar/nl-hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax-bs4-wsFalse-e10-lr0.00016-poolingfirst-layers-1-crfFalse-4"
2023-10-12 16:33:19,375 ----------------------------------------------------------------------------------------------------
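[Editor's note] A hedged sketch of a fine-tuning setup matching the parameters logged above (learning_rate 0.00016, mini_batch_size 4, max_epochs 10, linear schedule with warmup_fraction 0.1, micro-F1 model selection). The ByT5Embeddings wrapper from the model dump is replaced here by Flair's stock TransformerWordEmbeddings, and the checkpoint id is inferred from the base path above; both are assumptions, not the original code.

    # Sketch, assuming Flair's standard fine-tuning API.
    from flair.datasets import NER_ICDAR_EUROPEANA
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    corpus = NER_ICDAR_EUROPEANA(language="nl")
    label_dict = corpus.make_label_dictionary(label_type="ner")

    embeddings = TransformerWordEmbeddings(
        # checkpoint id inferred from the base path above (assumption)
        model="hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax",
        layers="-1",                # "layers-1" in the base path
        subtoken_pooling="first",   # "poolingfirst" in the base path
        fine_tune=True,
    )

    tagger = SequenceTagger(
        hidden_size=256,            # unused when use_rnn=False
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,              # "crfFalse": plain linear head + cross-entropy, as in the dump
        use_rnn=False,
        reproject_embeddings=False,
    )

    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        "hmbench-icdar/nl-hmbyt5-preliminary/...",  # output base path; full value logged above
        learning_rate=0.00016,
        mini_batch_size=4,
        max_epochs=10,
        # the 0.1 warmup fraction and TensorBoard logging come from the plugins logged above
    )
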
2023-10-12 16:33:19,375 ----------------------------------------------------------------------------------------------------
2023-10-12 16:33:19,375 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-12 16:34:03,482 epoch 1 - iter 144/1445 - loss 2.56300254 - time (sec): 44.10 - samples/sec: 405.02 - lr: 0.000016 - momentum: 0.000000
2023-10-12 16:34:47,886 epoch 1 - iter 288/1445 - loss 2.38526191 - time (sec): 88.51 - samples/sec: 407.78 - lr: 0.000032 - momentum: 0.000000
2023-10-12 16:35:30,464 epoch 1 - iter 432/1445 - loss 2.13289859 - time (sec): 131.09 - samples/sec: 405.11 - lr: 0.000048 - momentum: 0.000000
2023-10-12 16:36:13,934 epoch 1 - iter 576/1445 - loss 1.84024411 - time (sec): 174.56 - samples/sec: 405.11 - lr: 0.000064 - momentum: 0.000000
2023-10-12 16:36:56,656 epoch 1 - iter 720/1445 - loss 1.55933001 - time (sec): 217.28 - samples/sec: 406.68 - lr: 0.000080 - momentum: 0.000000
2023-10-12 16:37:37,250 epoch 1 - iter 864/1445 - loss 1.35312862 - time (sec): 257.87 - samples/sec: 407.98 - lr: 0.000096 - momentum: 0.000000
2023-10-12 16:38:19,681 epoch 1 - iter 1008/1445 - loss 1.19495096 - time (sec): 300.30 - samples/sec: 407.48 - lr: 0.000112 - momentum: 0.000000
2023-10-12 16:39:02,528 epoch 1 - iter 1152/1445 - loss 1.06319855 - time (sec): 343.15 - samples/sec: 409.47 - lr: 0.000127 - momentum: 0.000000
2023-10-12 16:39:44,972 epoch 1 - iter 1296/1445 - loss 0.96042465 - time (sec): 385.59 - samples/sec: 412.03 - lr: 0.000143 - momentum: 0.000000
2023-10-12 16:40:27,593 epoch 1 - iter 1440/1445 - loss 0.88488446 - time (sec): 428.22 - samples/sec: 410.24 - lr: 0.000159 - momentum: 0.000000
2023-10-12 16:40:28,827 ----------------------------------------------------------------------------------------------------
2023-10-12 16:40:28,828 EPOCH 1 done: loss 0.8825 - lr: 0.000159
2023-10-12 16:40:48,696 DEV : loss 0.18981926143169403 - f1-score (micro avg) 0.3887
2023-10-12 16:40:48,730 saving best model
2023-10-12 16:40:49,705 ----------------------------------------------------------------------------------------------------
2023-10-12 16:41:32,871 epoch 2 - iter 144/1445 - loss 0.13981313 - time (sec): 43.16 - samples/sec: 410.92 - lr: 0.000158 - momentum: 0.000000
2023-10-12 16:42:13,700 epoch 2 - iter 288/1445 - loss 0.13956803 - time (sec): 83.99 - samples/sec: 415.01 - lr: 0.000156 - momentum: 0.000000
2023-10-12 16:42:54,732 epoch 2 - iter 432/1445 - loss 0.13836300 - time (sec): 125.02 - samples/sec: 417.55 - lr: 0.000155 - momentum: 0.000000
2023-10-12 16:43:36,244 epoch 2 - iter 576/1445 - loss 0.13112203 - time (sec): 166.54 - samples/sec: 417.27 - lr: 0.000153 - momentum: 0.000000
2023-10-12 16:44:19,336 epoch 2 - iter 720/1445 - loss 0.12498734 - time (sec): 209.63 - samples/sec: 409.49 - lr: 0.000151 - momentum: 0.000000
2023-10-12 16:45:03,660 epoch 2 - iter 864/1445 - loss 0.12218588 - time (sec): 253.95 - samples/sec: 407.93 - lr: 0.000149 - momentum: 0.000000
2023-10-12 16:45:46,648 epoch 2 - iter 1008/1445 - loss 0.12188956 - time (sec): 296.94 - samples/sec: 410.26 - lr: 0.000148 - momentum: 0.000000
2023-10-12 16:46:27,790 epoch 2 - iter 1152/1445 - loss 0.11964934 - time (sec): 338.08 - samples/sec: 413.86 - lr: 0.000146 - momentum: 0.000000
2023-10-12 16:47:09,340 epoch 2 - iter 1296/1445 - loss 0.11521379 - time (sec): 379.63 - samples/sec: 415.90 - lr: 0.000144 - momentum: 0.000000
2023-10-12 16:47:50,210 epoch 2 - iter 1440/1445 - loss 0.11336959 - time (sec): 420.50 - samples/sec: 417.31 - lr: 0.000142 - momentum: 0.000000
2023-10-12 16:47:51,614 ----------------------------------------------------------------------------------------------------
2023-10-12 16:47:51,614 EPOCH 2 done: loss 0.1131 - lr: 0.000142
2023-10-12 16:48:12,874 DEV : loss 0.09119933843612671 - f1-score (micro avg) 0.8308
2023-10-12 16:48:12,904 saving best model
2023-10-12 16:48:15,491 ----------------------------------------------------------------------------------------------------
2023-10-12 16:48:57,020 epoch 3 - iter 144/1445 - loss 0.07985533 - time (sec): 41.52 - samples/sec: 405.49 - lr: 0.000140 - momentum: 0.000000
2023-10-12 16:49:38,239 epoch 3 - iter 288/1445 - loss 0.07523997 - time (sec): 82.74 - samples/sec: 418.79 - lr: 0.000139 - momentum: 0.000000
2023-10-12 16:50:18,976 epoch 3 - iter 432/1445 - loss 0.07488360 - time (sec): 123.48 - samples/sec: 419.10 - lr: 0.000137 - momentum: 0.000000
2023-10-12 16:50:59,095 epoch 3 - iter 576/1445 - loss 0.07112822 - time (sec): 163.60 - samples/sec: 421.60 - lr: 0.000135 - momentum: 0.000000
2023-10-12 16:51:39,727 epoch 3 - iter 720/1445 - loss 0.07107570 - time (sec): 204.23 - samples/sec: 423.37 - lr: 0.000133 - momentum: 0.000000
2023-10-12 16:52:23,561 epoch 3 - iter 864/1445 - loss 0.07062393 - time (sec): 248.06 - samples/sec: 425.40 - lr: 0.000132 - momentum: 0.000000
2023-10-12 16:53:06,138 epoch 3 - iter 1008/1445 - loss 0.06769948 - time (sec): 290.64 - samples/sec: 424.70 - lr: 0.000130 - momentum: 0.000000
2023-10-12 16:53:47,855 epoch 3 - iter 1152/1445 - loss 0.06676501 - time (sec): 332.36 - samples/sec: 423.10 - lr: 0.000128 - momentum: 0.000000
2023-10-12 16:54:29,300 epoch 3 - iter 1296/1445 - loss 0.06508146 - time (sec): 373.80 - samples/sec: 422.23 - lr: 0.000126 - momentum: 0.000000
2023-10-12 16:55:11,121 epoch 3 - iter 1440/1445 - loss 0.06435452 - time (sec): 415.62 - samples/sec: 422.69 - lr: 0.000125 - momentum: 0.000000
2023-10-12 16:55:12,348 ----------------------------------------------------------------------------------------------------
2023-10-12 16:55:12,349 EPOCH 3 done: loss 0.0644 - lr: 0.000125
2023-10-12 16:55:32,850 DEV : loss 0.06621405482292175 - f1-score (micro avg) 0.8638
2023-10-12 16:55:32,879 saving best model
2023-10-12 16:55:35,449 ----------------------------------------------------------------------------------------------------
2023-10-12 16:56:16,669 epoch 4 - iter 144/1445 - loss 0.03491763 - time (sec): 41.22 - samples/sec: 458.62 - lr: 0.000123 - momentum: 0.000000
2023-10-12 16:56:55,734 epoch 4 - iter 288/1445 - loss 0.03857362 - time (sec): 80.28 - samples/sec: 436.36 - lr: 0.000121 - momentum: 0.000000
2023-10-12 16:57:35,889 epoch 4 - iter 432/1445 - loss 0.03858829 - time (sec): 120.43 - samples/sec: 429.12 - lr: 0.000119 - momentum: 0.000000
2023-10-12 16:58:16,860 epoch 4 - iter 576/1445 - loss 0.04126412 - time (sec): 161.41 - samples/sec: 427.20 - lr: 0.000117 - momentum: 0.000000
2023-10-12 16:58:57,647 epoch 4 - iter 720/1445 - loss 0.04127926 - time (sec): 202.19 - samples/sec: 425.21 - lr: 0.000116 - momentum: 0.000000
2023-10-12 16:59:40,966 epoch 4 - iter 864/1445 - loss 0.04331915 - time (sec): 245.51 - samples/sec: 423.87 - lr: 0.000114 - momentum: 0.000000
2023-10-12 17:00:22,245 epoch 4 - iter 1008/1445 - loss 0.04325634 - time (sec): 286.79 - samples/sec: 429.29 - lr: 0.000112 - momentum: 0.000000
2023-10-12 17:01:01,932 epoch 4 - iter 1152/1445 - loss 0.04376823 - time (sec): 326.48 - samples/sec: 429.38 - lr: 0.000110 - momentum: 0.000000
2023-10-12 17:01:41,831 epoch 4 - iter 1296/1445 - loss 0.04471474 - time (sec): 366.38 - samples/sec: 430.38 - lr: 0.000109 - momentum: 0.000000
2023-10-12 17:02:22,332 epoch 4 - iter 1440/1445 - loss 0.04407434 - time (sec): 406.88 - samples/sec: 431.94 - lr: 0.000107 - momentum: 0.000000
2023-10-12 17:02:23,481 ----------------------------------------------------------------------------------------------------
2023-10-12 17:02:23,481 EPOCH 4 done: loss 0.0441 - lr: 0.000107
2023-10-12 17:02:44,195 DEV : loss 0.08768334984779358 - f1-score (micro avg) 0.8564
2023-10-12 17:02:44,226 ----------------------------------------------------------------------------------------------------
2023-10-12 17:03:24,221 epoch 5 - iter 144/1445 - loss 0.01928414 - time (sec): 39.99 - samples/sec: 424.43 - lr: 0.000105 - momentum: 0.000000
2023-10-12 17:04:04,655 epoch 5 - iter 288/1445 - loss 0.02782982 - time (sec): 80.43 - samples/sec: 427.60 - lr: 0.000103 - momentum: 0.000000
2023-10-12 17:04:44,485 epoch 5 - iter 432/1445 - loss 0.02750831 - time (sec): 120.26 - samples/sec: 433.41 - lr: 0.000101 - momentum: 0.000000
2023-10-12 17:05:25,173 epoch 5 - iter 576/1445 - loss 0.02988547 - time (sec): 160.95 - samples/sec: 438.34 - lr: 0.000100 - momentum: 0.000000
2023-10-12 17:06:05,256 epoch 5 - iter 720/1445 - loss 0.03160699 - time (sec): 201.03 - samples/sec: 438.23 - lr: 0.000098 - momentum: 0.000000
2023-10-12 17:06:45,729 epoch 5 - iter 864/1445 - loss 0.03162838 - time (sec): 241.50 - samples/sec: 437.51 - lr: 0.000096 - momentum: 0.000000
2023-10-12 17:07:24,736 epoch 5 - iter 1008/1445 - loss 0.03117068 - time (sec): 280.51 - samples/sec: 435.41 - lr: 0.000094 - momentum: 0.000000
2023-10-12 17:08:05,615 epoch 5 - iter 1152/1445 - loss 0.03085839 - time (sec): 321.39 - samples/sec: 435.33 - lr: 0.000093 - momentum: 0.000000
2023-10-12 17:08:46,790 epoch 5 - iter 1296/1445 - loss 0.03070485 - time (sec): 362.56 - samples/sec: 434.52 - lr: 0.000091 - momentum: 0.000000
2023-10-12 17:09:27,677 epoch 5 - iter 1440/1445 - loss 0.03200208 - time (sec): 403.45 - samples/sec: 435.47 - lr: 0.000089 - momentum: 0.000000
2023-10-12 17:09:28,879 ----------------------------------------------------------------------------------------------------
2023-10-12 17:09:28,879 EPOCH 5 done: loss 0.0319 - lr: 0.000089
2023-10-12 17:09:49,304 DEV : loss 0.10739118605852127 - f1-score (micro avg) 0.8495
2023-10-12 17:09:49,334 ----------------------------------------------------------------------------------------------------
2023-10-12 17:10:30,193 epoch 6 - iter 144/1445 - loss 0.03926076 - time (sec): 40.86 - samples/sec: 439.95 - lr: 0.000087 - momentum: 0.000000
2023-10-12 17:11:09,843 epoch 6 - iter 288/1445 - loss 0.03195608 - time (sec): 80.51 - samples/sec: 434.47 - lr: 0.000085 - momentum: 0.000000
2023-10-12 17:11:50,454 epoch 6 - iter 432/1445 - loss 0.03002443 - time (sec): 121.12 - samples/sec: 436.80 - lr: 0.000084 - momentum: 0.000000
2023-10-12 17:12:31,958 epoch 6 - iter 576/1445 - loss 0.02767328 - time (sec): 162.62 - samples/sec: 441.01 - lr: 0.000082 - momentum: 0.000000
2023-10-12 17:13:10,662 epoch 6 - iter 720/1445 - loss 0.02642560 - time (sec): 201.33 - samples/sec: 434.28 - lr: 0.000080 - momentum: 0.000000
2023-10-12 17:13:52,438 epoch 6 - iter 864/1445 - loss 0.02552455 - time (sec): 243.10 - samples/sec: 439.35 - lr: 0.000078 - momentum: 0.000000
2023-10-12 17:14:31,894 epoch 6 - iter 1008/1445 - loss 0.02476366 - time (sec): 282.56 - samples/sec: 437.47 - lr: 0.000076 - momentum: 0.000000
2023-10-12 17:15:12,546 epoch 6 - iter 1152/1445 - loss 0.02469782 - time (sec): 323.21 - samples/sec: 436.99 - lr: 0.000075 - momentum: 0.000000
2023-10-12 17:15:54,048 epoch 6 - iter 1296/1445 - loss 0.02478677 - time (sec): 364.71 - samples/sec: 435.29 - lr: 0.000073 - momentum: 0.000000
2023-10-12 17:16:33,885 epoch 6 - iter 1440/1445 - loss 0.02484737 - time (sec): 404.55 - samples/sec: 434.17 - lr: 0.000071 - momentum: 0.000000
2023-10-12 17:16:35,141 ----------------------------------------------------------------------------------------------------
2023-10-12 17:16:35,141 EPOCH 6 done: loss 0.0248 - lr: 0.000071
2023-10-12 17:16:56,074 DEV : loss 0.11206483840942383 - f1-score (micro avg) 0.8582
2023-10-12 17:16:56,105 ----------------------------------------------------------------------------------------------------
2023-10-12 17:17:37,350 epoch 7 - iter 144/1445 - loss 0.03196159 - time (sec): 41.24 - samples/sec: 405.21 - lr: 0.000069 - momentum: 0.000000
2023-10-12 17:18:21,847 epoch 7 - iter 288/1445 - loss 0.02494608 - time (sec): 85.74 - samples/sec: 417.58 - lr: 0.000068 - momentum: 0.000000
2023-10-12 17:19:04,928 epoch 7 - iter 432/1445 - loss 0.02528055 - time (sec): 128.82 - samples/sec: 416.07 - lr: 0.000066 - momentum: 0.000000
2023-10-12 17:19:47,446 epoch 7 - iter 576/1445 - loss 0.02201332 - time (sec): 171.34 - samples/sec: 414.68 - lr: 0.000064 - momentum: 0.000000
2023-10-12 17:20:30,277 epoch 7 - iter 720/1445 - loss 0.02177977 - time (sec): 214.17 - samples/sec: 414.34 - lr: 0.000062 - momentum: 0.000000
2023-10-12 17:21:12,767 epoch 7 - iter 864/1445 - loss 0.02046133 - time (sec): 256.66 - samples/sec: 416.64 - lr: 0.000060 - momentum: 0.000000
2023-10-12 17:21:54,758 epoch 7 - iter 1008/1445 - loss 0.02080315 - time (sec): 298.65 - samples/sec: 419.29 - lr: 0.000059 - momentum: 0.000000
2023-10-12 17:22:35,231 epoch 7 - iter 1152/1445 - loss 0.01995810 - time (sec): 339.12 - samples/sec: 420.44 - lr: 0.000057 - momentum: 0.000000
2023-10-12 17:23:15,074 epoch 7 - iter 1296/1445 - loss 0.01959775 - time (sec): 378.97 - samples/sec: 419.20 - lr: 0.000055 - momentum: 0.000000
2023-10-12 17:23:57,171 epoch 7 - iter 1440/1445 - loss 0.01919288 - time (sec): 421.06 - samples/sec: 417.48 - lr: 0.000053 - momentum: 0.000000
2023-10-12 17:23:58,329 ----------------------------------------------------------------------------------------------------
2023-10-12 17:23:58,330 EPOCH 7 done: loss 0.0191 - lr: 0.000053
2023-10-12 17:24:20,533 DEV : loss 0.12889063358306885 - f1-score (micro avg) 0.853
2023-10-12 17:24:20,562 ----------------------------------------------------------------------------------------------------
2023-10-12 17:25:01,698 epoch 8 - iter 144/1445 - loss 0.01021579 - time (sec): 41.13 - samples/sec: 434.73 - lr: 0.000052 - momentum: 0.000000
2023-10-12 17:25:42,787 epoch 8 - iter 288/1445 - loss 0.01004474 - time (sec): 82.22 - samples/sec: 437.18 - lr: 0.000050 - momentum: 0.000000
2023-10-12 17:26:23,418 epoch 8 - iter 432/1445 - loss 0.01117270 - time (sec): 122.85 - samples/sec: 435.97 - lr: 0.000048 - momentum: 0.000000
2023-10-12 17:27:05,836 epoch 8 - iter 576/1445 - loss 0.01033342 - time (sec): 165.27 - samples/sec: 437.03 - lr: 0.000046 - momentum: 0.000000
2023-10-12 17:27:45,951 epoch 8 - iter 720/1445 - loss 0.01317246 - time (sec): 205.39 - samples/sec: 428.88 - lr: 0.000044 - momentum: 0.000000
2023-10-12 17:28:26,943 epoch 8 - iter 864/1445 - loss 0.01310077 - time (sec): 246.38 - samples/sec: 426.90 - lr: 0.000043 - momentum: 0.000000
2023-10-12 17:29:07,910 epoch 8 - iter 1008/1445 - loss 0.01364110 - time (sec): 287.35 - samples/sec: 426.32 - lr: 0.000041 - momentum: 0.000000
2023-10-12 17:29:50,319 epoch 8 - iter 1152/1445 - loss 0.01391426 - time (sec): 329.75 - samples/sec: 425.34 - lr: 0.000039 - momentum: 0.000000
2023-10-12 17:30:32,149 epoch 8 - iter 1296/1445 - loss 0.01330664 - time (sec): 371.58 - samples/sec: 424.88 - lr: 0.000037 - momentum: 0.000000
2023-10-12 17:31:14,018 epoch 8 - iter 1440/1445 - loss 0.01380146 - time (sec): 413.45 - samples/sec: 425.19 - lr: 0.000036 - momentum: 0.000000
2023-10-12 17:31:15,188 ----------------------------------------------------------------------------------------------------
2023-10-12 17:31:15,188 EPOCH 8 done: loss 0.0138 - lr: 0.000036
2023-10-12 17:31:35,933 DEV : loss 0.12251030653715134 - f1-score (micro avg) 0.8642
2023-10-12 17:31:35,963 saving best model
2023-10-12 17:31:36,938 ----------------------------------------------------------------------------------------------------
2023-10-12 17:32:18,958 epoch 9 - iter 144/1445 - loss 0.00745630 - time (sec): 42.02 - samples/sec: 437.46 - lr: 0.000034 - momentum: 0.000000
2023-10-12 17:33:00,161 epoch 9 - iter 288/1445 - loss 0.00761895 - time (sec): 83.22 - samples/sec: 422.78 - lr: 0.000032 - momentum: 0.000000
2023-10-12 17:33:40,433 epoch 9 - iter 432/1445 - loss 0.00995441 - time (sec): 123.49 - samples/sec: 422.87 - lr: 0.000030 - momentum: 0.000000
2023-10-12 17:34:21,456 epoch 9 - iter 576/1445 - loss 0.00880978 - time (sec): 164.52 - samples/sec: 423.26 - lr: 0.000028 - momentum: 0.000000
2023-10-12 17:35:02,766 epoch 9 - iter 720/1445 - loss 0.00785433 - time (sec): 205.83 - samples/sec: 427.36 - lr: 0.000027 - momentum: 0.000000
2023-10-12 17:35:43,814 epoch 9 - iter 864/1445 - loss 0.00778198 - time (sec): 246.87 - samples/sec: 427.54 - lr: 0.000025 - momentum: 0.000000
2023-10-12 17:36:24,978 epoch 9 - iter 1008/1445 - loss 0.00817747 - time (sec): 288.04 - samples/sec: 430.09 - lr: 0.000023 - momentum: 0.000000
2023-10-12 17:37:05,337 epoch 9 - iter 1152/1445 - loss 0.00847105 - time (sec): 328.40 - samples/sec: 429.48 - lr: 0.000021 - momentum: 0.000000
2023-10-12 17:37:45,918 epoch 9 - iter 1296/1445 - loss 0.00923305 - time (sec): 368.98 - samples/sec: 426.79 - lr: 0.000020 - momentum: 0.000000
2023-10-12 17:38:28,699 epoch 9 - iter 1440/1445 - loss 0.00900821 - time (sec): 411.76 - samples/sec: 425.30 - lr: 0.000018 - momentum: 0.000000
2023-10-12 17:38:30,718 ----------------------------------------------------------------------------------------------------
2023-10-12 17:38:30,718 EPOCH 9 done: loss 0.0096 - lr: 0.000018
2023-10-12 17:38:52,760 DEV : loss 0.14077246189117432 - f1-score (micro avg) 0.8633
2023-10-12 17:38:52,801 ----------------------------------------------------------------------------------------------------
2023-10-12 17:39:37,018 epoch 10 - iter 144/1445 - loss 0.01174630 - time (sec): 44.21 - samples/sec: 426.17 - lr: 0.000016 - momentum: 0.000000
2023-10-12 17:40:20,486 epoch 10 - iter 288/1445 - loss 0.01002501 - time (sec): 87.68 - samples/sec: 415.57 - lr: 0.000014 - momentum: 0.000000
2023-10-12 17:41:04,188 epoch 10 - iter 432/1445 - loss 0.00910839 - time (sec): 131.39 - samples/sec: 407.70 - lr: 0.000012 - momentum: 0.000000
2023-10-12 17:41:48,780 epoch 10 - iter 576/1445 - loss 0.00824494 - time (sec): 175.98 - samples/sec: 405.69 - lr: 0.000011 - momentum: 0.000000
2023-10-12 17:42:33,468 epoch 10 - iter 720/1445 - loss 0.00769188 - time (sec): 220.67 - samples/sec: 404.45 - lr: 0.000009 - momentum: 0.000000
2023-10-12 17:43:20,283 epoch 10 - iter 864/1445 - loss 0.00752313 - time (sec): 267.48 - samples/sec: 403.45 - lr: 0.000007 - momentum: 0.000000
2023-10-12 17:44:01,105 epoch 10 - iter 1008/1445 - loss 0.00804398 - time (sec): 308.30 - samples/sec: 401.60 - lr: 0.000005 - momentum: 0.000000
2023-10-12 17:44:43,927 epoch 10 - iter 1152/1445 - loss 0.00739720 - time (sec): 351.12 - samples/sec: 405.21 - lr: 0.000004 - momentum: 0.000000
2023-10-12 17:45:25,058 epoch 10 - iter 1296/1445 - loss 0.00730540 - time (sec): 392.26 - samples/sec: 404.88 - lr: 0.000002 - momentum: 0.000000
2023-10-12 17:46:07,507 epoch 10 - iter 1440/1445 - loss 0.00754475 - time (sec): 434.70 - samples/sec: 404.40 - lr: 0.000000 - momentum: 0.000000
2023-10-12 17:46:08,714 ----------------------------------------------------------------------------------------------------
2023-10-12 17:46:08,715 EPOCH 10 done: loss 0.0075 - lr: 0.000000
2023-10-12 17:46:30,653 DEV : loss 0.14291653037071228 - f1-score (micro avg) 0.8604
2023-10-12 17:46:31,574 ----------------------------------------------------------------------------------------------------
2023-10-12 17:46:31,576 Loading model from best epoch ...
2023-10-12 17:46:35,453 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-12 17:46:59,023
Results:
- F-score (micro) 0.851
- F-score (macro) 0.744
- Accuracy 0.7524
By class:
              precision    recall  f1-score   support

         PER     0.8621    0.8693    0.8657       482
         LOC     0.9277    0.8690    0.8974       458
         ORG     0.4474    0.4928    0.4690        69

   micro avg     0.8587    0.8434    0.8510      1009
   macro avg     0.7457    0.7437    0.7440      1009
weighted avg     0.8636    0.8434    0.8530      1009
2023-10-12 17:46:59,023 ----------------------------------------------------------------------------------------------------
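[Editor's note] To use the resulting checkpoint, a minimal sketch of loading best-model.pt and tagging a sentence with the 13-tag dictionary listed above; the model path and the example sentence are assumptions, not part of the log.

    # Sketch: load the saved best model and run NER prediction with Flair.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    # path assumed from the "Model training base path" logged above
    tagger = SequenceTagger.load("hmbench-icdar/nl-hmbyt5-preliminary/.../best-model.pt")

    sentence = Sentence("Willem Barentsz vertrok uit Amsterdam .")
    tagger.predict(sentence)
    for span in sentence.get_spans("ner"):
        print(span.text, span.get_label("ner").value, span.get_label("ner").score)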