2023-10-17 16:51:50,595 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:50,596 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 16:51:50,596 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:50,596 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 16:51:50,596 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:50,596 Train: 5777 sentences
2023-10-17 16:51:50,596 (train_with_dev=False, train_with_test=False)
2023-10-17 16:51:50,596 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:50,596 Training Params:
2023-10-17 16:51:50,596 - learning_rate: "3e-05"
2023-10-17 16:51:50,596 - mini_batch_size: "4"
2023-10-17 16:51:50,596 - max_epochs: "10"
2023-10-17 16:51:50,596 - shuffle: "True"
2023-10-17 16:51:50,596 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:50,596 Plugins:
2023-10-17 16:51:50,596 - TensorboardLogger
2023-10-17 16:51:50,596 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 16:51:50,596 ----------------------------------------------------------------------------------------------------
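Note on the LinearScheduler above: a warmup_fraction of 0.1 means the learning rate ramps up over the first 10% of all optimizer steps and then decays linearly to zero. With 1445 mini-batches per epoch and 10 epochs, the warmup covers roughly the first epoch, which matches the lr column in the epoch logs below (rising to 3e-05 by the end of epoch 1, then falling). The following is a minimal sketch of that schedule; the variable names are illustrative and not taken from the log or from Flair's implementation:

```python
# Illustrative recomputation of the linear warmup/decay schedule implied above.
steps_per_epoch = 1445                  # mini-batches per epoch, from the log
total_steps = steps_per_epoch * 10      # 10 epochs
warmup_steps = int(0.1 * total_steps)   # warmup_fraction: 0.1 -> 1445 steps
peak_lr = 3e-05

def lr_at(step: int) -> float:
    """Linear warmup to peak_lr, then linear decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# e.g. lr_at(144) is about 3e-06 and lr_at(1440) about 3e-05,
# matching the lr values logged during epoch 1 below.
```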
2023-10-17 16:51:50,596 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 16:51:50,596 - metric: "('micro avg', 'f1-score')"
2023-10-17 16:51:50,596 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:50,596 Computation:
2023-10-17 16:51:50,596 - compute on device: cuda:0
2023-10-17 16:51:50,596 - embedding storage: none
2023-10-17 16:51:50,596 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:50,596 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 16:51:50,596 ----------------------------------------------------------------------------------------------------
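For context, the run described above (historic multilingual ELECTRA embeddings, first-subtoken pooling, final layer only, no CRF, batch size 4, 10 epochs, lr 3e-05 with 10% linear warmup) corresponds roughly to the following Flair fine-tuning setup. This is a hedged reconstruction from the log and the base-path naming, not the original training script; the argument names follow Flair's public API but the exact call used for this run may differ:

```python
# Hedged reconstruction of the training setup from the log above, not the original script.
from flair.datasets import NER_ICDAR_EUROPEANA
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# Dutch split of the ICDAR Europeana NER corpus (5777/722/723 sentences above).
corpus = NER_ICDAR_EUROPEANA(language="nl")
label_dict = corpus.make_label_dictionary(label_type="ner")

# hmTEAMS historic multilingual discriminator, last layer only, first-subtoken pooling.
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

# Plain linear projection head, no CRF and no RNN (matches the printed model).
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3",
    learning_rate=3e-05,
    mini_batch_size=4,
    max_epochs=10,
)
```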
2023-10-17 16:51:50,597 ----------------------------------------------------------------------------------------------------
2023-10-17 16:51:50,597 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 16:51:57,734 epoch 1 - iter 144/1445 - loss 2.51964113 - time (sec): 7.14 - samples/sec: 2407.06 - lr: 0.000003 - momentum: 0.000000
2023-10-17 16:52:04,801 epoch 1 - iter 288/1445 - loss 1.46886525 - time (sec): 14.20 - samples/sec: 2396.44 - lr: 0.000006 - momentum: 0.000000
2023-10-17 16:52:11,824 epoch 1 - iter 432/1445 - loss 1.03580590 - time (sec): 21.23 - samples/sec: 2436.57 - lr: 0.000009 - momentum: 0.000000
2023-10-17 16:52:18,859 epoch 1 - iter 576/1445 - loss 0.82263272 - time (sec): 28.26 - samples/sec: 2459.21 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:52:26,087 epoch 1 - iter 720/1445 - loss 0.67840038 - time (sec): 35.49 - samples/sec: 2485.64 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:52:33,197 epoch 1 - iter 864/1445 - loss 0.58315275 - time (sec): 42.60 - samples/sec: 2500.11 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:52:40,137 epoch 1 - iter 1008/1445 - loss 0.51794596 - time (sec): 49.54 - samples/sec: 2498.80 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:52:47,253 epoch 1 - iter 1152/1445 - loss 0.46799546 - time (sec): 56.66 - samples/sec: 2495.94 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:52:54,086 epoch 1 - iter 1296/1445 - loss 0.43471267 - time (sec): 63.49 - samples/sec: 2473.24 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:53:01,099 epoch 1 - iter 1440/1445 - loss 0.40016424 - time (sec): 70.50 - samples/sec: 2488.40 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:53:01,370 ----------------------------------------------------------------------------------------------------
2023-10-17 16:53:01,371 EPOCH 1 done: loss 0.3987 - lr: 0.000030
2023-10-17 16:53:03,993 DEV : loss 0.12171746790409088 - f1-score (micro avg) 0.7585
2023-10-17 16:53:04,008 saving best model
2023-10-17 16:53:04,339 ----------------------------------------------------------------------------------------------------
2023-10-17 16:53:11,206 epoch 2 - iter 144/1445 - loss 0.09547817 - time (sec): 6.87 - samples/sec: 2527.57 - lr: 0.000030 - momentum: 0.000000
2023-10-17 16:53:18,194 epoch 2 - iter 288/1445 - loss 0.10207718 - time (sec): 13.85 - samples/sec: 2511.05 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:53:25,162 epoch 2 - iter 432/1445 - loss 0.10574612 - time (sec): 20.82 - samples/sec: 2486.76 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:53:32,148 epoch 2 - iter 576/1445 - loss 0.10231456 - time (sec): 27.81 - samples/sec: 2484.41 - lr: 0.000029 - momentum: 0.000000
2023-10-17 16:53:39,311 epoch 2 - iter 720/1445 - loss 0.09986497 - time (sec): 34.97 - samples/sec: 2513.27 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:53:46,749 epoch 2 - iter 864/1445 - loss 0.09794746 - time (sec): 42.41 - samples/sec: 2537.18 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:53:53,859 epoch 2 - iter 1008/1445 - loss 0.09611171 - time (sec): 49.52 - samples/sec: 2529.21 - lr: 0.000028 - momentum: 0.000000
2023-10-17 16:54:01,602 epoch 2 - iter 1152/1445 - loss 0.09333721 - time (sec): 57.26 - samples/sec: 2481.95 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:54:09,733 epoch 2 - iter 1296/1445 - loss 0.09231752 - time (sec): 65.39 - samples/sec: 2426.32 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:54:17,227 epoch 2 - iter 1440/1445 - loss 0.09139074 - time (sec): 72.89 - samples/sec: 2408.27 - lr: 0.000027 - momentum: 0.000000
2023-10-17 16:54:17,477 ----------------------------------------------------------------------------------------------------
2023-10-17 16:54:17,478 EPOCH 2 done: loss 0.0913 - lr: 0.000027
2023-10-17 16:54:21,098 DEV : loss 0.08095023036003113 - f1-score (micro avg) 0.8018
2023-10-17 16:54:21,115 saving best model
2023-10-17 16:54:21,557 ----------------------------------------------------------------------------------------------------
2023-10-17 16:54:28,597 epoch 3 - iter 144/1445 - loss 0.07907593 - time (sec): 7.03 - samples/sec: 2471.23 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:54:35,963 epoch 3 - iter 288/1445 - loss 0.06930842 - time (sec): 14.40 - samples/sec: 2491.47 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:54:42,970 epoch 3 - iter 432/1445 - loss 0.06547744 - time (sec): 21.41 - samples/sec: 2533.28 - lr: 0.000026 - momentum: 0.000000
2023-10-17 16:54:49,929 epoch 3 - iter 576/1445 - loss 0.06189547 - time (sec): 28.37 - samples/sec: 2537.73 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:54:56,792 epoch 3 - iter 720/1445 - loss 0.06117221 - time (sec): 35.23 - samples/sec: 2510.04 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:55:04,122 epoch 3 - iter 864/1445 - loss 0.06199025 - time (sec): 42.56 - samples/sec: 2503.89 - lr: 0.000025 - momentum: 0.000000
2023-10-17 16:55:11,617 epoch 3 - iter 1008/1445 - loss 0.06500361 - time (sec): 50.05 - samples/sec: 2487.30 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:55:18,545 epoch 3 - iter 1152/1445 - loss 0.06477948 - time (sec): 56.98 - samples/sec: 2477.01 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:55:25,742 epoch 3 - iter 1296/1445 - loss 0.06476144 - time (sec): 64.18 - samples/sec: 2466.25 - lr: 0.000024 - momentum: 0.000000
2023-10-17 16:55:32,997 epoch 3 - iter 1440/1445 - loss 0.06672483 - time (sec): 71.43 - samples/sec: 2462.82 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:55:33,220 ----------------------------------------------------------------------------------------------------
2023-10-17 16:55:33,221 EPOCH 3 done: loss 0.0667 - lr: 0.000023
2023-10-17 16:55:36,446 DEV : loss 0.07555373758077621 - f1-score (micro avg) 0.8639
2023-10-17 16:55:36,464 saving best model
2023-10-17 16:55:36,902 ----------------------------------------------------------------------------------------------------
2023-10-17 16:55:43,816 epoch 4 - iter 144/1445 - loss 0.04753253 - time (sec): 6.91 - samples/sec: 2416.67 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:55:51,351 epoch 4 - iter 288/1445 - loss 0.04514475 - time (sec): 14.44 - samples/sec: 2398.27 - lr: 0.000023 - momentum: 0.000000
2023-10-17 16:55:58,535 epoch 4 - iter 432/1445 - loss 0.04852100 - time (sec): 21.63 - samples/sec: 2392.92 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:56:05,540 epoch 4 - iter 576/1445 - loss 0.05156722 - time (sec): 28.63 - samples/sec: 2418.03 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:56:12,372 epoch 4 - iter 720/1445 - loss 0.05036006 - time (sec): 35.47 - samples/sec: 2433.19 - lr: 0.000022 - momentum: 0.000000
2023-10-17 16:56:19,508 epoch 4 - iter 864/1445 - loss 0.05094762 - time (sec): 42.60 - samples/sec: 2438.61 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:56:26,524 epoch 4 - iter 1008/1445 - loss 0.05000499 - time (sec): 49.62 - samples/sec: 2453.87 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:56:33,720 epoch 4 - iter 1152/1445 - loss 0.05036806 - time (sec): 56.81 - samples/sec: 2481.30 - lr: 0.000021 - momentum: 0.000000
2023-10-17 16:56:40,708 epoch 4 - iter 1296/1445 - loss 0.04919696 - time (sec): 63.80 - samples/sec: 2475.63 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:56:47,668 epoch 4 - iter 1440/1445 - loss 0.04959858 - time (sec): 70.76 - samples/sec: 2483.70 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:56:47,903 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:47,903 EPOCH 4 done: loss 0.0496 - lr: 0.000020
2023-10-17 16:56:51,737 DEV : loss 0.08526481688022614 - f1-score (micro avg) 0.8661
2023-10-17 16:56:51,757 saving best model
2023-10-17 16:56:52,198 ----------------------------------------------------------------------------------------------------
2023-10-17 16:56:59,878 epoch 5 - iter 144/1445 - loss 0.02983267 - time (sec): 7.68 - samples/sec: 2338.64 - lr: 0.000020 - momentum: 0.000000
2023-10-17 16:57:06,651 epoch 5 - iter 288/1445 - loss 0.02587426 - time (sec): 14.45 - samples/sec: 2423.26 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:57:13,811 epoch 5 - iter 432/1445 - loss 0.03178345 - time (sec): 21.61 - samples/sec: 2453.21 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:57:20,904 epoch 5 - iter 576/1445 - loss 0.03049647 - time (sec): 28.70 - samples/sec: 2478.29 - lr: 0.000019 - momentum: 0.000000
2023-10-17 16:57:27,751 epoch 5 - iter 720/1445 - loss 0.02973021 - time (sec): 35.55 - samples/sec: 2474.24 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:57:34,537 epoch 5 - iter 864/1445 - loss 0.03342990 - time (sec): 42.34 - samples/sec: 2504.53 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:57:41,369 epoch 5 - iter 1008/1445 - loss 0.03551324 - time (sec): 49.17 - samples/sec: 2522.36 - lr: 0.000018 - momentum: 0.000000
2023-10-17 16:57:48,235 epoch 5 - iter 1152/1445 - loss 0.03569568 - time (sec): 56.03 - samples/sec: 2512.81 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:57:55,333 epoch 5 - iter 1296/1445 - loss 0.03592580 - time (sec): 63.13 - samples/sec: 2500.90 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:58:03,284 epoch 5 - iter 1440/1445 - loss 0.03540825 - time (sec): 71.08 - samples/sec: 2471.52 - lr: 0.000017 - momentum: 0.000000
2023-10-17 16:58:03,556 ----------------------------------------------------------------------------------------------------
2023-10-17 16:58:03,556 EPOCH 5 done: loss 0.0353 - lr: 0.000017
2023-10-17 16:58:06,848 DEV : loss 0.12203659117221832 - f1-score (micro avg) 0.831
2023-10-17 16:58:06,868 ----------------------------------------------------------------------------------------------------
2023-10-17 16:58:14,955 epoch 6 - iter 144/1445 - loss 0.02842518 - time (sec): 8.09 - samples/sec: 2318.05 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:58:22,926 epoch 6 - iter 288/1445 - loss 0.02055464 - time (sec): 16.06 - samples/sec: 2226.37 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:58:31,227 epoch 6 - iter 432/1445 - loss 0.02489207 - time (sec): 24.36 - samples/sec: 2229.50 - lr: 0.000016 - momentum: 0.000000
2023-10-17 16:58:38,990 epoch 6 - iter 576/1445 - loss 0.02645244 - time (sec): 32.12 - samples/sec: 2259.49 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:58:45,907 epoch 6 - iter 720/1445 - loss 0.02618404 - time (sec): 39.04 - samples/sec: 2296.06 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:58:52,910 epoch 6 - iter 864/1445 - loss 0.02602900 - time (sec): 46.04 - samples/sec: 2339.43 - lr: 0.000015 - momentum: 0.000000
2023-10-17 16:58:59,842 epoch 6 - iter 1008/1445 - loss 0.02750063 - time (sec): 52.97 - samples/sec: 2349.00 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:59:06,963 epoch 6 - iter 1152/1445 - loss 0.02837136 - time (sec): 60.09 - samples/sec: 2363.19 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:59:13,986 epoch 6 - iter 1296/1445 - loss 0.02684921 - time (sec): 67.12 - samples/sec: 2363.48 - lr: 0.000014 - momentum: 0.000000
2023-10-17 16:59:20,726 epoch 6 - iter 1440/1445 - loss 0.02655197 - time (sec): 73.86 - samples/sec: 2379.68 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:59:20,957 ----------------------------------------------------------------------------------------------------
2023-10-17 16:59:20,957 EPOCH 6 done: loss 0.0265 - lr: 0.000013
2023-10-17 16:59:24,232 DEV : loss 0.1185513511300087 - f1-score (micro avg) 0.8581
2023-10-17 16:59:24,251 ----------------------------------------------------------------------------------------------------
2023-10-17 16:59:31,348 epoch 7 - iter 144/1445 - loss 0.01592139 - time (sec): 7.10 - samples/sec: 2420.48 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:59:38,719 epoch 7 - iter 288/1445 - loss 0.01953479 - time (sec): 14.47 - samples/sec: 2385.00 - lr: 0.000013 - momentum: 0.000000
2023-10-17 16:59:45,900 epoch 7 - iter 432/1445 - loss 0.01960054 - time (sec): 21.65 - samples/sec: 2414.78 - lr: 0.000012 - momentum: 0.000000
2023-10-17 16:59:53,535 epoch 7 - iter 576/1445 - loss 0.02108098 - time (sec): 29.28 - samples/sec: 2404.40 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:00:00,596 epoch 7 - iter 720/1445 - loss 0.01990454 - time (sec): 36.34 - samples/sec: 2412.16 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:00:07,764 epoch 7 - iter 864/1445 - loss 0.02048029 - time (sec): 43.51 - samples/sec: 2438.92 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:00:14,808 epoch 7 - iter 1008/1445 - loss 0.01953974 - time (sec): 50.56 - samples/sec: 2460.86 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:00:21,791 epoch 7 - iter 1152/1445 - loss 0.01979941 - time (sec): 57.54 - samples/sec: 2460.68 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:00:28,760 epoch 7 - iter 1296/1445 - loss 0.01839308 - time (sec): 64.51 - samples/sec: 2453.29 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:00:35,846 epoch 7 - iter 1440/1445 - loss 0.01795360 - time (sec): 71.59 - samples/sec: 2449.53 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:00:36,194 ----------------------------------------------------------------------------------------------------
2023-10-17 17:00:36,194 EPOCH 7 done: loss 0.0180 - lr: 0.000010
2023-10-17 17:00:39,469 DEV : loss 0.12616300582885742 - f1-score (micro avg) 0.8691
2023-10-17 17:00:39,485 saving best model
2023-10-17 17:00:39,951 ----------------------------------------------------------------------------------------------------
2023-10-17 17:00:46,624 epoch 8 - iter 144/1445 - loss 0.01689844 - time (sec): 6.67 - samples/sec: 2504.32 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:00:53,857 epoch 8 - iter 288/1445 - loss 0.01100211 - time (sec): 13.90 - samples/sec: 2442.08 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:01:00,829 epoch 8 - iter 432/1445 - loss 0.01437202 - time (sec): 20.88 - samples/sec: 2452.85 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:01:07,822 epoch 8 - iter 576/1445 - loss 0.01241985 - time (sec): 27.87 - samples/sec: 2473.87 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:01:15,067 epoch 8 - iter 720/1445 - loss 0.01363731 - time (sec): 35.11 - samples/sec: 2465.52 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:01:22,117 epoch 8 - iter 864/1445 - loss 0.01368516 - time (sec): 42.16 - samples/sec: 2464.53 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:01:29,232 epoch 8 - iter 1008/1445 - loss 0.01357477 - time (sec): 49.28 - samples/sec: 2472.94 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:01:36,222 epoch 8 - iter 1152/1445 - loss 0.01410465 - time (sec): 56.27 - samples/sec: 2479.90 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:01:43,482 epoch 8 - iter 1296/1445 - loss 0.01364676 - time (sec): 63.53 - samples/sec: 2503.19 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:01:50,427 epoch 8 - iter 1440/1445 - loss 0.01339047 - time (sec): 70.47 - samples/sec: 2491.94 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:01:50,657 ----------------------------------------------------------------------------------------------------
2023-10-17 17:01:50,657 EPOCH 8 done: loss 0.0134 - lr: 0.000007
2023-10-17 17:01:53,972 DEV : loss 0.12705564498901367 - f1-score (micro avg) 0.872
2023-10-17 17:01:53,990 saving best model
2023-10-17 17:01:54,439 ----------------------------------------------------------------------------------------------------
2023-10-17 17:02:01,517 epoch 9 - iter 144/1445 - loss 0.00381457 - time (sec): 7.08 - samples/sec: 2421.87 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:02:08,664 epoch 9 - iter 288/1445 - loss 0.00734927 - time (sec): 14.22 - samples/sec: 2462.26 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:02:15,773 epoch 9 - iter 432/1445 - loss 0.00735271 - time (sec): 21.33 - samples/sec: 2461.35 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:02:23,175 epoch 9 - iter 576/1445 - loss 0.00769310 - time (sec): 28.73 - samples/sec: 2467.86 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:02:30,572 epoch 9 - iter 720/1445 - loss 0.00864776 - time (sec): 36.13 - samples/sec: 2470.49 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:02:38,094 epoch 9 - iter 864/1445 - loss 0.00916627 - time (sec): 43.65 - samples/sec: 2447.52 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:02:45,162 epoch 9 - iter 1008/1445 - loss 0.00973331 - time (sec): 50.72 - samples/sec: 2433.57 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:02:51,976 epoch 9 - iter 1152/1445 - loss 0.00925598 - time (sec): 57.53 - samples/sec: 2426.27 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:02:59,048 epoch 9 - iter 1296/1445 - loss 0.00914846 - time (sec): 64.61 - samples/sec: 2445.75 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:03:06,122 epoch 9 - iter 1440/1445 - loss 0.00948694 - time (sec): 71.68 - samples/sec: 2450.76 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:03:06,346 ----------------------------------------------------------------------------------------------------
2023-10-17 17:03:06,347 EPOCH 9 done: loss 0.0095 - lr: 0.000003
2023-10-17 17:03:09,701 DEV : loss 0.14085045456886292 - f1-score (micro avg) 0.8698
2023-10-17 17:03:09,724 ----------------------------------------------------------------------------------------------------
2023-10-17 17:03:16,841 epoch 10 - iter 144/1445 - loss 0.00996398 - time (sec): 7.12 - samples/sec: 2532.12 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:03:23,906 epoch 10 - iter 288/1445 - loss 0.00789760 - time (sec): 14.18 - samples/sec: 2449.41 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:03:31,192 epoch 10 - iter 432/1445 - loss 0.00862420 - time (sec): 21.47 - samples/sec: 2474.56 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:03:38,308 epoch 10 - iter 576/1445 - loss 0.00751122 - time (sec): 28.58 - samples/sec: 2490.18 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:03:45,506 epoch 10 - iter 720/1445 - loss 0.00674528 - time (sec): 35.78 - samples/sec: 2480.89 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:03:52,464 epoch 10 - iter 864/1445 - loss 0.00666326 - time (sec): 42.74 - samples/sec: 2488.41 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:03:59,509 epoch 10 - iter 1008/1445 - loss 0.00638795 - time (sec): 49.78 - samples/sec: 2479.99 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:04:06,580 epoch 10 - iter 1152/1445 - loss 0.00631451 - time (sec): 56.85 - samples/sec: 2480.07 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:04:13,723 epoch 10 - iter 1296/1445 - loss 0.00652823 - time (sec): 64.00 - samples/sec: 2482.67 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:04:20,891 epoch 10 - iter 1440/1445 - loss 0.00649172 - time (sec): 71.17 - samples/sec: 2470.94 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:04:21,133 ----------------------------------------------------------------------------------------------------
2023-10-17 17:04:21,134 EPOCH 10 done: loss 0.0065 - lr: 0.000000
2023-10-17 17:04:24,372 DEV : loss 0.1461455523967743 - f1-score (micro avg) 0.8679
2023-10-17 17:04:24,745 ----------------------------------------------------------------------------------------------------
2023-10-17 17:04:24,746 Loading model from best epoch ...
2023-10-17 17:04:26,084 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 17:04:28,840
Results:
- F-score (micro) 0.8773
- F-score (macro) 0.7854
- Accuracy 0.7879
By class:
              precision    recall  f1-score   support

         PER     0.8737    0.8755    0.8746       482
         LOC     0.9603    0.8974    0.9278       458
         ORG     0.5902    0.5217    0.5538        69

   micro avg     0.8940    0.8612    0.8773      1009
   macro avg     0.8081    0.7649    0.7854      1009
weighted avg     0.8936    0.8612    0.8768      1009
2023-10-17 17:04:28,841 ----------------------------------------------------------------------------------------------------
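After training, the best-model.pt saved at the "saving best model" steps above can be loaded for tagging in the usual Flair way. A minimal usage sketch follows; the example sentence is made up, and the checkpoint path simply assumes the training base path from the log:

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# Load the checkpoint written during training (path assumed from the base path above).
tagger = SequenceTagger.load(
    "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3/best-model.pt"
)

# Example Dutch sentence, invented for illustration.
sentence = Sentence("Vincent van Gogh werd geboren in Zundert .")
tagger.predict(sentence)

# Prints predicted PER/LOC/ORG spans from the 13-tag BIOES dictionary listed above.
for span in sentence.get_spans("ner"):
    print(span.text, span.get_label("ner").value, span.get_label("ner").score)
```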