2023-10-17 17:04:55,524 ----------------------------------------------------------------------------------------------------
2023-10-17 17:04:55,525 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 17:04:55,525 ----------------------------------------------------------------------------------------------------
2023-10-17 17:04:55,525 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 17:04:55,525 ----------------------------------------------------------------------------------------------------
2023-10-17 17:04:55,525 Train: 5777 sentences
2023-10-17 17:04:55,525 (train_with_dev=False, train_with_test=False)
2023-10-17 17:04:55,525 ----------------------------------------------------------------------------------------------------
2023-10-17 17:04:55,525 Training Params:
2023-10-17 17:04:55,525 - learning_rate: "5e-05"
2023-10-17 17:04:55,525 - mini_batch_size: "4"
2023-10-17 17:04:55,525 - max_epochs: "10"
2023-10-17 17:04:55,525 - shuffle: "True"
2023-10-17 17:04:55,525 ----------------------------------------------------------------------------------------------------
2023-10-17 17:04:55,525 Plugins:
2023-10-17 17:04:55,525 - TensorboardLogger
2023-10-17 17:04:55,525 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 17:04:55,525 ----------------------------------------------------------------------------------------------------
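Note on the learning-rate column: the `lr` values logged during training are consistent with a linear warmup over the first 10% of steps followed by a linear decay to zero. The sketch below is an illustration only, not Flair's actual `LinearScheduler` code. The function name `linear_warmup_lr` is hypothetical, and the total of 14450 steps is an assumption derived from 1445 iterations per epoch times 10 epochs.

```python
def linear_warmup_lr(step, peak_lr=5e-5, total_steps=1445 * 10, warmup_fraction=0.1):
    """Approximate lr at a given global step (assumed schedule, see note above)."""
    warmup_steps = round(total_steps * warmup_fraction)  # 1445 steps, i.e. epoch 1
    if step < warmup_steps:
        # Warmup phase: rise linearly from 0 to peak_lr.
        return peak_lr * step / warmup_steps
    # Decay phase: fall linearly from peak_lr to 0 at total_steps.
    return peak_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


# Compare against a few logged checkpoints (epoch, iteration within epoch).
for epoch, it in [(1, 144), (1, 1440), (2, 144), (10, 1440)]:
    step = (epoch - 1) * 1445 + it
    print(f"epoch {epoch} iter {it}: lr ~ {linear_warmup_lr(step):.6f}")
```

Under these assumptions, the warmup ends exactly at the close of epoch 1, which is why the logged `lr` peaks at 0.000050 there and declines in every later epoch.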
2023-10-17 17:04:55,525 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 17:04:55,525 - metric: "('micro avg', 'f1-score')"
2023-10-17 17:04:55,525 ----------------------------------------------------------------------------------------------------
2023-10-17 17:04:55,525 Computation:
2023-10-17 17:04:55,525 - compute on device: cuda:0
2023-10-17 17:04:55,525 - embedding storage: none
2023-10-17 17:04:55,525 ----------------------------------------------------------------------------------------------------
2023-10-17 17:04:55,526 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 17:04:55,526 ----------------------------------------------------------------------------------------------------
2023-10-17 17:04:55,526 ----------------------------------------------------------------------------------------------------
2023-10-17 17:04:55,526 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 17:05:02,906 epoch 1 - iter 144/1445 - loss 2.04744536 - time (sec): 7.38 - samples/sec: 2328.00 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:05:10,040 epoch 1 - iter 288/1445 - loss 1.16515708 - time (sec): 14.51 - samples/sec: 2345.32 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:05:16,936 epoch 1 - iter 432/1445 - loss 0.82511112 - time (sec): 21.41 - samples/sec: 2415.80 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:05:24,122 epoch 1 - iter 576/1445 - loss 0.66268920 - time (sec): 28.60 - samples/sec: 2430.47 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:05:31,275 epoch 1 - iter 720/1445 - loss 0.54957029 - time (sec): 35.75 - samples/sec: 2467.67 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:05:38,463 epoch 1 - iter 864/1445 - loss 0.47492151 - time (sec): 42.94 - samples/sec: 2480.51 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:05:45,429 epoch 1 - iter 1008/1445 - loss 0.42416443 - time (sec): 49.90 - samples/sec: 2480.63 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:05:52,469 epoch 1 - iter 1152/1445 - loss 0.38566918 - time (sec): 56.94 - samples/sec: 2483.33 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:05:59,365 epoch 1 - iter 1296/1445 - loss 0.36046029 - time (sec): 63.84 - samples/sec: 2459.71 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:06:06,401 epoch 1 - iter 1440/1445 - loss 0.33304682 - time (sec): 70.87 - samples/sec: 2475.30 - lr: 0.000050 - momentum: 0.000000
2023-10-17 17:06:06,673 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:06,673 EPOCH 1 done: loss 0.3318 - lr: 0.000050
2023-10-17 17:06:09,347 DEV : loss 0.09467300027608871 - f1-score (micro avg) 0.7739
2023-10-17 17:06:09,364 saving best model
2023-10-17 17:06:09,696 ----------------------------------------------------------------------------------------------------
2023-10-17 17:06:16,645 epoch 2 - iter 144/1445 - loss 0.08876150 - time (sec): 6.95 - samples/sec: 2497.71 - lr: 0.000049 - momentum: 0.000000
2023-10-17 17:06:23,663 epoch 2 - iter 288/1445 - loss 0.09747449 - time (sec): 13.97 - samples/sec: 2491.11 - lr: 0.000049 - momentum: 0.000000
2023-10-17 17:06:30,614 epoch 2 - iter 432/1445 - loss 0.10608025 - time (sec): 20.92 - samples/sec: 2475.55 - lr: 0.000048 - momentum: 0.000000
2023-10-17 17:06:37,529 epoch 2 - iter 576/1445 - loss 0.10270502 - time (sec): 27.83 - samples/sec: 2482.27 - lr: 0.000048 - momentum: 0.000000
2023-10-17 17:06:44,828 epoch 2 - iter 720/1445 - loss 0.09945147 - time (sec): 35.13 - samples/sec: 2501.83 - lr: 0.000047 - momentum: 0.000000
2023-10-17 17:06:52,376 epoch 2 - iter 864/1445 - loss 0.09821080 - time (sec): 42.68 - samples/sec: 2521.14 - lr: 0.000047 - momentum: 0.000000
2023-10-17 17:06:59,408 epoch 2 - iter 1008/1445 - loss 0.09698307 - time (sec): 49.71 - samples/sec: 2519.44 - lr: 0.000046 - momentum: 0.000000
2023-10-17 17:07:06,366 epoch 2 - iter 1152/1445 - loss 0.09459084 - time (sec): 56.67 - samples/sec: 2507.92 - lr: 0.000046 - momentum: 0.000000
2023-10-17 17:07:13,360 epoch 2 - iter 1296/1445 - loss 0.11694060 - time (sec): 63.66 - samples/sec: 2492.28 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:07:20,514 epoch 2 - iter 1440/1445 - loss 0.11332756 - time (sec): 70.82 - samples/sec: 2478.69 - lr: 0.000044 - momentum: 0.000000
2023-10-17 17:07:20,745 ----------------------------------------------------------------------------------------------------
2023-10-17 17:07:20,745 EPOCH 2 done: loss 0.1132 - lr: 0.000044
2023-10-17 17:07:24,308 DEV : loss 0.10420767962932587 - f1-score (micro avg) 0.8009
2023-10-17 17:07:24,324 saving best model
2023-10-17 17:07:24,783 ----------------------------------------------------------------------------------------------------
2023-10-17 17:07:31,927 epoch 3 - iter 144/1445 - loss 0.08470755 - time (sec): 7.14 - samples/sec: 2435.17 - lr: 0.000044 - momentum: 0.000000
2023-10-17 17:07:39,197 epoch 3 - iter 288/1445 - loss 0.07396372 - time (sec): 14.41 - samples/sec: 2489.97 - lr: 0.000043 - momentum: 0.000000
2023-10-17 17:07:46,305 epoch 3 - iter 432/1445 - loss 0.07383465 - time (sec): 21.52 - samples/sec: 2520.38 - lr: 0.000043 - momentum: 0.000000
2023-10-17 17:07:53,470 epoch 3 - iter 576/1445 - loss 0.06831722 - time (sec): 28.68 - samples/sec: 2509.82 - lr: 0.000042 - momentum: 0.000000
2023-10-17 17:08:00,420 epoch 3 - iter 720/1445 - loss 0.06838521 - time (sec): 35.63 - samples/sec: 2481.69 - lr: 0.000042 - momentum: 0.000000
2023-10-17 17:08:07,496 epoch 3 - iter 864/1445 - loss 0.07099289 - time (sec): 42.71 - samples/sec: 2495.14 - lr: 0.000041 - momentum: 0.000000
2023-10-17 17:08:14,716 epoch 3 - iter 1008/1445 - loss 0.07256253 - time (sec): 49.93 - samples/sec: 2493.61 - lr: 0.000041 - momentum: 0.000000
2023-10-17 17:08:21,813 epoch 3 - iter 1152/1445 - loss 0.07227280 - time (sec): 57.02 - samples/sec: 2475.20 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:08:28,856 epoch 3 - iter 1296/1445 - loss 0.07185740 - time (sec): 64.07 - samples/sec: 2470.54 - lr: 0.000039 - momentum: 0.000000
2023-10-17 17:08:35,964 epoch 3 - iter 1440/1445 - loss 0.07400816 - time (sec): 71.18 - samples/sec: 2471.75 - lr: 0.000039 - momentum: 0.000000
2023-10-17 17:08:36,192 ----------------------------------------------------------------------------------------------------
2023-10-17 17:08:36,192 EPOCH 3 done: loss 0.0740 - lr: 0.000039
2023-10-17 17:08:39,376 DEV : loss 0.07721319794654846 - f1-score (micro avg) 0.8617
2023-10-17 17:08:39,393 saving best model
2023-10-17 17:08:39,828 ----------------------------------------------------------------------------------------------------
2023-10-17 17:08:46,820 epoch 4 - iter 144/1445 - loss 0.05757716 - time (sec): 6.99 - samples/sec: 2389.26 - lr: 0.000038 - momentum: 0.000000
2023-10-17 17:08:54,123 epoch 4 - iter 288/1445 - loss 0.05567864 - time (sec): 14.29 - samples/sec: 2423.78 - lr: 0.000038 - momentum: 0.000000
2023-10-17 17:09:01,229 epoch 4 - iter 432/1445 - loss 0.05284901 - time (sec): 21.40 - samples/sec: 2418.66 - lr: 0.000037 - momentum: 0.000000
2023-10-17 17:09:08,705 epoch 4 - iter 576/1445 - loss 0.05592908 - time (sec): 28.87 - samples/sec: 2397.87 - lr: 0.000037 - momentum: 0.000000
2023-10-17 17:09:15,661 epoch 4 - iter 720/1445 - loss 0.05517652 - time (sec): 35.83 - samples/sec: 2408.47 - lr: 0.000036 - momentum: 0.000000
2023-10-17 17:09:22,687 epoch 4 - iter 864/1445 - loss 0.05536967 - time (sec): 42.86 - samples/sec: 2424.10 - lr: 0.000036 - momentum: 0.000000
2023-10-17 17:09:29,787 epoch 4 - iter 1008/1445 - loss 0.05543565 - time (sec): 49.96 - samples/sec: 2437.22 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:09:37,137 epoch 4 - iter 1152/1445 - loss 0.05526822 - time (sec): 57.31 - samples/sec: 2459.94 - lr: 0.000034 - momentum: 0.000000
2023-10-17 17:09:44,260 epoch 4 - iter 1296/1445 - loss 0.05451781 - time (sec): 64.43 - samples/sec: 2451.52 - lr: 0.000034 - momentum: 0.000000
2023-10-17 17:09:51,253 epoch 4 - iter 1440/1445 - loss 0.05521212 - time (sec): 71.42 - samples/sec: 2460.69 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:09:51,478 ----------------------------------------------------------------------------------------------------
2023-10-17 17:09:51,478 EPOCH 4 done: loss 0.0554 - lr: 0.000033
2023-10-17 17:09:54,822 DEV : loss 0.1105174869298935 - f1-score (micro avg) 0.8481
2023-10-17 17:09:54,845 ----------------------------------------------------------------------------------------------------
2023-10-17 17:10:02,920 epoch 5 - iter 144/1445 - loss 0.03691405 - time (sec): 8.07 - samples/sec: 2224.21 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:10:10,045 epoch 5 - iter 288/1445 - loss 0.03176100 - time (sec): 15.20 - samples/sec: 2304.05 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:10:17,602 epoch 5 - iter 432/1445 - loss 0.03932322 - time (sec): 22.75 - samples/sec: 2329.78 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:10:24,977 epoch 5 - iter 576/1445 - loss 0.03671770 - time (sec): 30.13 - samples/sec: 2360.90 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:10:32,045 epoch 5 - iter 720/1445 - loss 0.03597991 - time (sec): 37.20 - samples/sec: 2364.69 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:10:39,156 epoch 5 - iter 864/1445 - loss 0.03800795 - time (sec): 44.31 - samples/sec: 2393.05 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:10:46,424 epoch 5 - iter 1008/1445 - loss 0.03888293 - time (sec): 51.58 - samples/sec: 2404.57 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:10:53,392 epoch 5 - iter 1152/1445 - loss 0.04005684 - time (sec): 58.55 - samples/sec: 2405.04 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:11:00,395 epoch 5 - iter 1296/1445 - loss 0.03964899 - time (sec): 65.55 - samples/sec: 2408.75 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:11:07,736 epoch 5 - iter 1440/1445 - loss 0.03902191 - time (sec): 72.89 - samples/sec: 2410.31 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:11:07,980 ----------------------------------------------------------------------------------------------------
2023-10-17 17:11:07,980 EPOCH 5 done: loss 0.0389 - lr: 0.000028
2023-10-17 17:11:11,203 DEV : loss 0.1348668783903122 - f1-score (micro avg) 0.8429
2023-10-17 17:11:11,220 ----------------------------------------------------------------------------------------------------
2023-10-17 17:11:18,313 epoch 6 - iter 144/1445 - loss 0.02667685 - time (sec): 7.09 - samples/sec: 2642.82 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:11:25,407 epoch 6 - iter 288/1445 - loss 0.02437191 - time (sec): 14.19 - samples/sec: 2520.07 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:11:32,591 epoch 6 - iter 432/1445 - loss 0.02572777 - time (sec): 21.37 - samples/sec: 2541.33 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:11:39,608 epoch 6 - iter 576/1445 - loss 0.02873206 - time (sec): 28.39 - samples/sec: 2556.68 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:11:46,265 epoch 6 - iter 720/1445 - loss 0.02986499 - time (sec): 35.04 - samples/sec: 2557.72 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:11:53,395 epoch 6 - iter 864/1445 - loss 0.02883648 - time (sec): 42.17 - samples/sec: 2553.96 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:12:00,334 epoch 6 - iter 1008/1445 - loss 0.02974928 - time (sec): 49.11 - samples/sec: 2533.59 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:12:07,379 epoch 6 - iter 1152/1445 - loss 0.03018321 - time (sec): 56.16 - samples/sec: 2528.82 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:12:14,362 epoch 6 - iter 1296/1445 - loss 0.02977708 - time (sec): 63.14 - samples/sec: 2512.31 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:12:21,234 epoch 6 - iter 1440/1445 - loss 0.02903161 - time (sec): 70.01 - samples/sec: 2510.34 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:12:21,461 ----------------------------------------------------------------------------------------------------
2023-10-17 17:12:21,461 EPOCH 6 done: loss 0.0290 - lr: 0.000022
2023-10-17 17:12:24,802 DEV : loss 0.1451694220304489 - f1-score (micro avg) 0.8613
2023-10-17 17:12:24,825 ----------------------------------------------------------------------------------------------------
2023-10-17 17:12:31,930 epoch 7 - iter 144/1445 - loss 0.02087668 - time (sec): 7.10 - samples/sec: 2417.93 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:12:39,148 epoch 7 - iter 288/1445 - loss 0.02185794 - time (sec): 14.32 - samples/sec: 2409.26 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:12:46,500 epoch 7 - iter 432/1445 - loss 0.02175364 - time (sec): 21.67 - samples/sec: 2411.87 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:12:53,739 epoch 7 - iter 576/1445 - loss 0.02252627 - time (sec): 28.91 - samples/sec: 2435.17 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:13:01,179 epoch 7 - iter 720/1445 - loss 0.02077423 - time (sec): 36.35 - samples/sec: 2411.62 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:13:08,331 epoch 7 - iter 864/1445 - loss 0.02175611 - time (sec): 43.50 - samples/sec: 2439.32 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:13:15,291 epoch 7 - iter 1008/1445 - loss 0.02053015 - time (sec): 50.46 - samples/sec: 2465.32 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:13:22,299 epoch 7 - iter 1152/1445 - loss 0.02076891 - time (sec): 57.47 - samples/sec: 2463.50 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:13:29,333 epoch 7 - iter 1296/1445 - loss 0.01993760 - time (sec): 64.51 - samples/sec: 2453.37 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:13:36,373 epoch 7 - iter 1440/1445 - loss 0.01965177 - time (sec): 71.55 - samples/sec: 2451.17 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:13:36,727 ----------------------------------------------------------------------------------------------------
2023-10-17 17:13:36,727 EPOCH 7 done: loss 0.0196 - lr: 0.000017
2023-10-17 17:13:39,963 DEV : loss 0.16489093005657196 - f1-score (micro avg) 0.8464
2023-10-17 17:13:39,982 ----------------------------------------------------------------------------------------------------
2023-10-17 17:13:46,885 epoch 8 - iter 144/1445 - loss 0.01306226 - time (sec): 6.90 - samples/sec: 2419.95 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:13:54,119 epoch 8 - iter 288/1445 - loss 0.01102787 - time (sec): 14.14 - samples/sec: 2401.83 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:14:01,225 epoch 8 - iter 432/1445 - loss 0.01344659 - time (sec): 21.24 - samples/sec: 2410.46 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:14:08,409 epoch 8 - iter 576/1445 - loss 0.01269442 - time (sec): 28.43 - samples/sec: 2425.31 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:14:15,556 epoch 8 - iter 720/1445 - loss 0.01377224 - time (sec): 35.57 - samples/sec: 2433.65 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:14:22,489 epoch 8 - iter 864/1445 - loss 0.01260199 - time (sec): 42.51 - samples/sec: 2444.68 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:14:29,499 epoch 8 - iter 1008/1445 - loss 0.01260952 - time (sec): 49.52 - samples/sec: 2461.06 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:14:36,438 epoch 8 - iter 1152/1445 - loss 0.01357834 - time (sec): 56.46 - samples/sec: 2471.67 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:14:43,638 epoch 8 - iter 1296/1445 - loss 0.01319581 - time (sec): 63.65 - samples/sec: 2498.23 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:14:50,513 epoch 8 - iter 1440/1445 - loss 0.01282748 - time (sec): 70.53 - samples/sec: 2489.90 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:14:50,738 ----------------------------------------------------------------------------------------------------
2023-10-17 17:14:50,738 EPOCH 8 done: loss 0.0128 - lr: 0.000011
2023-10-17 17:14:53,961 DEV : loss 0.1513351947069168 - f1-score (micro avg) 0.8595
2023-10-17 17:14:53,979 ----------------------------------------------------------------------------------------------------
2023-10-17 17:15:00,976 epoch 9 - iter 144/1445 - loss 0.01099181 - time (sec): 7.00 - samples/sec: 2449.62 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:15:08,090 epoch 9 - iter 288/1445 - loss 0.01019344 - time (sec): 14.11 - samples/sec: 2481.99 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:15:15,251 epoch 9 - iter 432/1445 - loss 0.00967362 - time (sec): 21.27 - samples/sec: 2468.37 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:15:22,506 epoch 9 - iter 576/1445 - loss 0.01035916 - time (sec): 28.53 - samples/sec: 2485.88 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:15:29,912 epoch 9 - iter 720/1445 - loss 0.01031149 - time (sec): 35.93 - samples/sec: 2484.21 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:15:36,953 epoch 9 - iter 864/1445 - loss 0.01001435 - time (sec): 42.97 - samples/sec: 2486.28 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:15:43,984 epoch 9 - iter 1008/1445 - loss 0.00920076 - time (sec): 50.00 - samples/sec: 2468.52 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:15:50,879 epoch 9 - iter 1152/1445 - loss 0.00875804 - time (sec): 56.90 - samples/sec: 2453.39 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:15:57,895 epoch 9 - iter 1296/1445 - loss 0.00852447 - time (sec): 63.91 - samples/sec: 2472.25 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:16:04,969 epoch 9 - iter 1440/1445 - loss 0.00877198 - time (sec): 70.99 - samples/sec: 2474.67 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:16:05,199 ----------------------------------------------------------------------------------------------------
2023-10-17 17:16:05,199 EPOCH 9 done: loss 0.0087 - lr: 0.000006
2023-10-17 17:16:08,771 DEV : loss 0.16765910387039185 - f1-score (micro avg) 0.8544
2023-10-17 17:16:08,787 ----------------------------------------------------------------------------------------------------
2023-10-17 17:16:15,817 epoch 10 - iter 144/1445 - loss 0.00253582 - time (sec): 7.03 - samples/sec: 2563.59 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:16:22,833 epoch 10 - iter 288/1445 - loss 0.00331327 - time (sec): 14.04 - samples/sec: 2473.08 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:16:30,060 epoch 10 - iter 432/1445 - loss 0.00410042 - time (sec): 21.27 - samples/sec: 2497.27 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:16:37,212 epoch 10 - iter 576/1445 - loss 0.00431306 - time (sec): 28.42 - samples/sec: 2504.09 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:16:44,188 epoch 10 - iter 720/1445 - loss 0.00416941 - time (sec): 35.40 - samples/sec: 2507.58 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:16:51,345 epoch 10 - iter 864/1445 - loss 0.00456723 - time (sec): 42.56 - samples/sec: 2499.03 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:16:58,231 epoch 10 - iter 1008/1445 - loss 0.00492998 - time (sec): 49.44 - samples/sec: 2497.08 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:17:05,259 epoch 10 - iter 1152/1445 - loss 0.00486200 - time (sec): 56.47 - samples/sec: 2496.94 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:17:12,327 epoch 10 - iter 1296/1445 - loss 0.00520572 - time (sec): 63.54 - samples/sec: 2500.58 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:17:19,331 epoch 10 - iter 1440/1445 - loss 0.00554479 - time (sec): 70.54 - samples/sec: 2492.72 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:17:19,547 ----------------------------------------------------------------------------------------------------
2023-10-17 17:17:19,547 EPOCH 10 done: loss 0.0055 - lr: 0.000000
2023-10-17 17:17:22,759 DEV : loss 0.1753772795200348 - f1-score (micro avg) 0.8485
2023-10-17 17:17:23,144 ----------------------------------------------------------------------------------------------------
2023-10-17 17:17:23,145 Loading model from best epoch ...
2023-10-17 17:17:24,487 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 17:17:27,289
Results:
- F-score (micro) 0.8245
- F-score (macro) 0.7277
- Accuracy 0.7118
By class:
              precision    recall  f1-score   support

         PER     0.8133    0.8402    0.8265       482
         LOC     0.9221    0.8275    0.8723       458
         ORG     0.5254    0.4493    0.4844        69

   micro avg     0.8419    0.8077    0.8245      1009
   macro avg     0.7536    0.7057    0.7277      1009
weighted avg     0.8430    0.8077    0.8239      1009
2023-10-17 17:17:27,289 ----------------------------------------------------------------------------------------------------
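Note on the averages: the summary rows of the test report can be re-derived from the per-class rows. The sketch below is a plain-Python illustration, not the evaluation code that produced this log, and the helper names (`macro_avg`, `weighted_avg`, `f1`) are hypothetical. It works from the rounded four-decimal figures printed in the log, so the last digit may differ slightly from the logged values.

```python
# Per-class test scores copied from the log: (precision, recall, f1, support).
rows = {
    "PER": (0.8133, 0.8402, 0.8265, 482),
    "LOC": (0.9221, 0.8275, 0.8723, 458),
    "ORG": (0.5254, 0.4493, 0.4844, 69),
}


def macro_avg(rows, col):
    """Unweighted mean of one column across classes."""
    return sum(r[col] for r in rows.values()) / len(rows)


def weighted_avg(rows, col):
    """Mean of one column weighted by class support."""
    total = sum(r[3] for r in rows.values())
    return sum(r[col] * r[3] for r in rows.values()) / total


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0


macro_f1 = macro_avg(rows, 2)
weighted_f1 = weighted_avg(rows, 2)
micro_f1 = f1(0.8419, 0.8077)  # from the logged micro-average precision and recall

print(f"macro F1    ~ {macro_f1:.4f}")
print(f"weighted F1 ~ {weighted_f1:.4f}")
print(f"micro F1    ~ {micro_f1:.4f}")
```

The macro average weights all three classes equally, so the weak ORG class (69 test entities) pulls it well below the micro average, which is dominated by PER and LOC.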