2023-10-17 17:38:15,449 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:15,450 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 17:38:15,450 ----------------------------------------------------------------------------------------------------
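A minimal sketch (Python, Flair) of embeddings matching the architecture printed above; the backbone identifier is inferred from the model base path logged further down, and the pooling/layer settings from the run name, so treat them as assumptions.

from flair.embeddings import TransformerWordEmbeddings

# ELECTRA-style backbone with 12 layers and hidden size 768, as in the printout above.
# Settings inferred from the run name: "poolingfirst", "layers-1", "wsFalse".
embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # assumed from the base path below
    layers="-1",               # last transformer layer only
    subtoken_pooling="first",  # first-subtoken pooling
    fine_tune=True,
    use_context=False,         # assumption: "wsFalse" = no additional document context
)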
2023-10-17 17:38:15,450 MultiCorpus: 5777 train + 722 dev + 723 test sentences
- NER_ICDAR_EUROPEANA Corpus: 5777 train + 722 dev + 723 test sentences - /root/.flair/datasets/ner_icdar_europeana/nl
2023-10-17 17:38:15,451 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:15,451 Train: 5777 sentences
2023-10-17 17:38:15,451 (train_with_dev=False, train_with_test=False)
2023-10-17 17:38:15,451 ----------------------------------------------------------------------------------------------------
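A sketch of loading the same data with Flair's built-in dataset class; the class and its language argument are assumptions based on the cache path ner_icdar_europeana/nl shown above.

from flair.datasets import NER_ICDAR_EUROPEANA

corpus = NER_ICDAR_EUROPEANA(language="nl")  # Dutch split of the ICDAR-Europeana NER data (assumed signature)
print(corpus)                                # expect: 5777 train + 722 dev + 723 test sentences
label_dict = corpus.make_label_dictionary(label_type="ner")  # LOC/PER/ORG span labels; the tagger
                                                             # expands these to the 13 BIOES tags listed at the end of the log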
2023-10-17 17:38:15,451 Training Params:
2023-10-17 17:38:15,451 - learning_rate: "3e-05"
2023-10-17 17:38:15,451 - mini_batch_size: "4"
2023-10-17 17:38:15,451 - max_epochs: "10"
2023-10-17 17:38:15,451 - shuffle: "True"
2023-10-17 17:38:15,451 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:15,451 Plugins:
2023-10-17 17:38:15,451 - TensorboardLogger
2023-10-17 17:38:15,451 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 17:38:15,451 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:15,451 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 17:38:15,451 - metric: "('micro avg', 'f1-score')"
2023-10-17 17:38:15,451 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:15,451 Computation:
2023-10-17 17:38:15,451 - compute on device: cuda:0
2023-10-17 17:38:15,451 - embedding storage: none
2023-10-17 17:38:15,451 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:15,451 Model training base path: "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-17 17:38:15,451 ----------------------------------------------------------------------------------------------------
2023-10-17 17:38:15,451 ----------------------------------------------------------------------------------------------------
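A self-contained sketch of a fine-tuning run with the hyperparameters logged above (learning rate 3e-05, mini-batch size 4, 10 epochs). The linear warmup schedule (warmup_fraction 0.1) and TensorBoard logging shown in the Plugins section are left to Flair's fine_tune defaults and plugins here; the exact wiring of the original run is an assumption.

from flair.datasets import NER_ICDAR_EUROPEANA
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

corpus = NER_ICDAR_EUROPEANA(language="nl")                  # assumed dataset class, see above
label_dict = corpus.make_label_dictionary(label_type="ner")

embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # assumed backbone
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,              # not used by the linear head when use_rnn=False
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,                # "crfFalse" in the run name
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4",
    learning_rate=3e-5,
    mini_batch_size=4,
    max_epochs=10,
)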
2023-10-17 17:38:15,451 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 17:38:22,526 epoch 1 - iter 144/1445 - loss 2.78078658 - time (sec): 7.07 - samples/sec: 2289.63 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:38:29,934 epoch 1 - iter 288/1445 - loss 1.51861606 - time (sec): 14.48 - samples/sec: 2361.02 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:38:37,086 epoch 1 - iter 432/1445 - loss 1.07913069 - time (sec): 21.63 - samples/sec: 2389.34 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:38:44,314 epoch 1 - iter 576/1445 - loss 0.85096615 - time (sec): 28.86 - samples/sec: 2396.22 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:38:51,463 epoch 1 - iter 720/1445 - loss 0.70608937 - time (sec): 36.01 - samples/sec: 2419.98 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:38:58,764 epoch 1 - iter 864/1445 - loss 0.60898489 - time (sec): 43.31 - samples/sec: 2437.90 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:39:06,108 epoch 1 - iter 1008/1445 - loss 0.53851173 - time (sec): 50.66 - samples/sec: 2434.66 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:39:13,470 epoch 1 - iter 1152/1445 - loss 0.48695757 - time (sec): 58.02 - samples/sec: 2434.01 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:39:20,969 epoch 1 - iter 1296/1445 - loss 0.44708419 - time (sec): 65.52 - samples/sec: 2427.59 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:39:27,888 epoch 1 - iter 1440/1445 - loss 0.41563598 - time (sec): 72.44 - samples/sec: 2424.95 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:39:28,119 ----------------------------------------------------------------------------------------------------
2023-10-17 17:39:28,120 EPOCH 1 done: loss 0.4146 - lr: 0.000030
2023-10-17 17:39:31,097 DEV : loss 0.10123064368963242 - f1-score (micro avg) 0.6885
2023-10-17 17:39:31,119 saving best model
2023-10-17 17:39:31,520 ----------------------------------------------------------------------------------------------------
2023-10-17 17:39:38,950 epoch 2 - iter 144/1445 - loss 0.09208746 - time (sec): 7.43 - samples/sec: 2505.79 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:39:46,043 epoch 2 - iter 288/1445 - loss 0.09395078 - time (sec): 14.52 - samples/sec: 2443.88 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:39:53,806 epoch 2 - iter 432/1445 - loss 0.09675004 - time (sec): 22.28 - samples/sec: 2402.53 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:40:01,000 epoch 2 - iter 576/1445 - loss 0.09339561 - time (sec): 29.48 - samples/sec: 2413.55 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:40:08,104 epoch 2 - iter 720/1445 - loss 0.09667081 - time (sec): 36.58 - samples/sec: 2401.10 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:40:15,323 epoch 2 - iter 864/1445 - loss 0.09827203 - time (sec): 43.80 - samples/sec: 2384.73 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:40:23,213 epoch 2 - iter 1008/1445 - loss 0.09597707 - time (sec): 51.69 - samples/sec: 2384.96 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:40:30,320 epoch 2 - iter 1152/1445 - loss 0.09430318 - time (sec): 58.80 - samples/sec: 2373.58 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:40:37,393 epoch 2 - iter 1296/1445 - loss 0.09280542 - time (sec): 65.87 - samples/sec: 2384.11 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:40:44,895 epoch 2 - iter 1440/1445 - loss 0.09043549 - time (sec): 73.37 - samples/sec: 2395.99 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:40:45,128 ----------------------------------------------------------------------------------------------------
2023-10-17 17:40:45,129 EPOCH 2 done: loss 0.0906 - lr: 0.000027
2023-10-17 17:40:48,972 DEV : loss 0.08994048088788986 - f1-score (micro avg) 0.8041
2023-10-17 17:40:48,997 saving best model
2023-10-17 17:40:49,509 ----------------------------------------------------------------------------------------------------
2023-10-17 17:40:56,927 epoch 3 - iter 144/1445 - loss 0.05872201 - time (sec): 7.41 - samples/sec: 2360.16 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:41:03,913 epoch 3 - iter 288/1445 - loss 0.05857562 - time (sec): 14.40 - samples/sec: 2420.88 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:41:11,051 epoch 3 - iter 432/1445 - loss 0.06614001 - time (sec): 21.53 - samples/sec: 2394.73 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:41:18,016 epoch 3 - iter 576/1445 - loss 0.06718519 - time (sec): 28.50 - samples/sec: 2399.07 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:41:25,276 epoch 3 - iter 720/1445 - loss 0.06571201 - time (sec): 35.76 - samples/sec: 2412.78 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:41:32,934 epoch 3 - iter 864/1445 - loss 0.06765137 - time (sec): 43.42 - samples/sec: 2442.62 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:41:40,390 epoch 3 - iter 1008/1445 - loss 0.06716439 - time (sec): 50.87 - samples/sec: 2446.41 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:41:47,590 epoch 3 - iter 1152/1445 - loss 0.06595038 - time (sec): 58.07 - samples/sec: 2439.72 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:41:54,737 epoch 3 - iter 1296/1445 - loss 0.06557252 - time (sec): 65.22 - samples/sec: 2434.72 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:42:01,793 epoch 3 - iter 1440/1445 - loss 0.06558181 - time (sec): 72.28 - samples/sec: 2428.06 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:42:02,065 ----------------------------------------------------------------------------------------------------
2023-10-17 17:42:02,065 EPOCH 3 done: loss 0.0655 - lr: 0.000023
2023-10-17 17:42:05,446 DEV : loss 0.08329488337039948 - f1-score (micro avg) 0.8576
2023-10-17 17:42:05,463 saving best model
2023-10-17 17:42:05,984 ----------------------------------------------------------------------------------------------------
2023-10-17 17:42:13,079 epoch 4 - iter 144/1445 - loss 0.03908482 - time (sec): 7.09 - samples/sec: 2434.08 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:42:20,247 epoch 4 - iter 288/1445 - loss 0.04314761 - time (sec): 14.26 - samples/sec: 2426.27 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:42:27,325 epoch 4 - iter 432/1445 - loss 0.04479377 - time (sec): 21.34 - samples/sec: 2452.70 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:42:34,473 epoch 4 - iter 576/1445 - loss 0.04639701 - time (sec): 28.49 - samples/sec: 2443.28 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:42:41,304 epoch 4 - iter 720/1445 - loss 0.04663083 - time (sec): 35.32 - samples/sec: 2432.99 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:42:48,590 epoch 4 - iter 864/1445 - loss 0.04877294 - time (sec): 42.60 - samples/sec: 2446.39 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:42:56,157 epoch 4 - iter 1008/1445 - loss 0.05120701 - time (sec): 50.17 - samples/sec: 2442.48 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:43:03,410 epoch 4 - iter 1152/1445 - loss 0.05143852 - time (sec): 57.42 - samples/sec: 2428.03 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:43:11,041 epoch 4 - iter 1296/1445 - loss 0.05077155 - time (sec): 65.05 - samples/sec: 2416.48 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:43:18,170 epoch 4 - iter 1440/1445 - loss 0.05272378 - time (sec): 72.18 - samples/sec: 2432.26 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:43:18,411 ----------------------------------------------------------------------------------------------------
2023-10-17 17:43:18,411 EPOCH 4 done: loss 0.0528 - lr: 0.000020
2023-10-17 17:43:22,251 DEV : loss 0.09850851446390152 - f1-score (micro avg) 0.8597
2023-10-17 17:43:22,267 saving best model
2023-10-17 17:43:22,711 ----------------------------------------------------------------------------------------------------
2023-10-17 17:43:29,952 epoch 5 - iter 144/1445 - loss 0.02913580 - time (sec): 7.24 - samples/sec: 2455.71 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:43:37,014 epoch 5 - iter 288/1445 - loss 0.03254544 - time (sec): 14.30 - samples/sec: 2472.75 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:43:44,071 epoch 5 - iter 432/1445 - loss 0.03570376 - time (sec): 21.36 - samples/sec: 2430.96 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:43:51,427 epoch 5 - iter 576/1445 - loss 0.03709659 - time (sec): 28.71 - samples/sec: 2451.97 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:43:58,280 epoch 5 - iter 720/1445 - loss 0.03495719 - time (sec): 35.57 - samples/sec: 2450.94 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:44:05,351 epoch 5 - iter 864/1445 - loss 0.03454321 - time (sec): 42.64 - samples/sec: 2449.50 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:44:12,501 epoch 5 - iter 1008/1445 - loss 0.03603597 - time (sec): 49.79 - samples/sec: 2433.62 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:44:19,757 epoch 5 - iter 1152/1445 - loss 0.03646270 - time (sec): 57.04 - samples/sec: 2435.72 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:44:27,159 epoch 5 - iter 1296/1445 - loss 0.03650110 - time (sec): 64.45 - samples/sec: 2431.31 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:44:34,782 epoch 5 - iter 1440/1445 - loss 0.03803697 - time (sec): 72.07 - samples/sec: 2437.25 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:44:35,026 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:35,027 EPOCH 5 done: loss 0.0381 - lr: 0.000017
2023-10-17 17:44:38,550 DEV : loss 0.1178504228591919 - f1-score (micro avg) 0.8518
2023-10-17 17:44:38,569 ----------------------------------------------------------------------------------------------------
2023-10-17 17:44:46,150 epoch 6 - iter 144/1445 - loss 0.02532486 - time (sec): 7.58 - samples/sec: 2270.25 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:44:53,390 epoch 6 - iter 288/1445 - loss 0.02904802 - time (sec): 14.82 - samples/sec: 2313.74 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:45:00,617 epoch 6 - iter 432/1445 - loss 0.02918864 - time (sec): 22.05 - samples/sec: 2378.19 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:45:07,572 epoch 6 - iter 576/1445 - loss 0.02552071 - time (sec): 29.00 - samples/sec: 2348.34 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:45:15,347 epoch 6 - iter 720/1445 - loss 0.02625914 - time (sec): 36.78 - samples/sec: 2380.60 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:45:22,880 epoch 6 - iter 864/1445 - loss 0.02735437 - time (sec): 44.31 - samples/sec: 2377.18 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:45:30,013 epoch 6 - iter 1008/1445 - loss 0.02621190 - time (sec): 51.44 - samples/sec: 2380.53 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:45:37,375 epoch 6 - iter 1152/1445 - loss 0.02598037 - time (sec): 58.80 - samples/sec: 2407.78 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:45:44,521 epoch 6 - iter 1296/1445 - loss 0.02707724 - time (sec): 65.95 - samples/sec: 2399.92 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:45:51,591 epoch 6 - iter 1440/1445 - loss 0.02796616 - time (sec): 73.02 - samples/sec: 2407.42 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:45:51,807 ----------------------------------------------------------------------------------------------------
2023-10-17 17:45:51,807 EPOCH 6 done: loss 0.0280 - lr: 0.000013
2023-10-17 17:45:55,246 DEV : loss 0.12150729447603226 - f1-score (micro avg) 0.8515
2023-10-17 17:45:55,264 ----------------------------------------------------------------------------------------------------
2023-10-17 17:46:02,158 epoch 7 - iter 144/1445 - loss 0.01685010 - time (sec): 6.89 - samples/sec: 2430.06 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:46:09,209 epoch 7 - iter 288/1445 - loss 0.01264635 - time (sec): 13.94 - samples/sec: 2462.29 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:46:16,421 epoch 7 - iter 432/1445 - loss 0.01653909 - time (sec): 21.16 - samples/sec: 2502.88 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:46:23,521 epoch 7 - iter 576/1445 - loss 0.01929903 - time (sec): 28.26 - samples/sec: 2487.16 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:46:30,784 epoch 7 - iter 720/1445 - loss 0.01985335 - time (sec): 35.52 - samples/sec: 2493.20 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:46:38,378 epoch 7 - iter 864/1445 - loss 0.02389990 - time (sec): 43.11 - samples/sec: 2459.10 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:46:45,985 epoch 7 - iter 1008/1445 - loss 0.02333411 - time (sec): 50.72 - samples/sec: 2462.85 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:46:53,205 epoch 7 - iter 1152/1445 - loss 0.02230595 - time (sec): 57.94 - samples/sec: 2440.83 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:47:00,241 epoch 7 - iter 1296/1445 - loss 0.02208829 - time (sec): 64.98 - samples/sec: 2444.66 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:47:07,359 epoch 7 - iter 1440/1445 - loss 0.02176833 - time (sec): 72.09 - samples/sec: 2437.16 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:47:07,585 ----------------------------------------------------------------------------------------------------
2023-10-17 17:47:07,586 EPOCH 7 done: loss 0.0219 - lr: 0.000010
2023-10-17 17:47:10,966 DEV : loss 0.12723445892333984 - f1-score (micro avg) 0.8684
2023-10-17 17:47:10,985 saving best model
2023-10-17 17:47:11,514 ----------------------------------------------------------------------------------------------------
2023-10-17 17:47:18,708 epoch 8 - iter 144/1445 - loss 0.01150973 - time (sec): 7.19 - samples/sec: 2544.39 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:47:25,677 epoch 8 - iter 288/1445 - loss 0.01301164 - time (sec): 14.15 - samples/sec: 2555.66 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:47:32,800 epoch 8 - iter 432/1445 - loss 0.01277090 - time (sec): 21.28 - samples/sec: 2514.23 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:47:39,780 epoch 8 - iter 576/1445 - loss 0.01323955 - time (sec): 28.26 - samples/sec: 2479.48 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:47:46,861 epoch 8 - iter 720/1445 - loss 0.01367934 - time (sec): 35.34 - samples/sec: 2505.38 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:47:53,573 epoch 8 - iter 864/1445 - loss 0.01302397 - time (sec): 42.05 - samples/sec: 2521.56 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:48:00,449 epoch 8 - iter 1008/1445 - loss 0.01325247 - time (sec): 48.93 - samples/sec: 2499.40 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:48:07,708 epoch 8 - iter 1152/1445 - loss 0.01320999 - time (sec): 56.19 - samples/sec: 2489.07 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:48:15,148 epoch 8 - iter 1296/1445 - loss 0.01416894 - time (sec): 63.63 - samples/sec: 2487.61 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:48:22,372 epoch 8 - iter 1440/1445 - loss 0.01469288 - time (sec): 70.85 - samples/sec: 2481.73 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:48:22,622 ----------------------------------------------------------------------------------------------------
2023-10-17 17:48:22,623 EPOCH 8 done: loss 0.0147 - lr: 0.000007
2023-10-17 17:48:25,928 DEV : loss 0.13977086544036865 - f1-score (micro avg) 0.8641
2023-10-17 17:48:25,946 ----------------------------------------------------------------------------------------------------
2023-10-17 17:48:33,142 epoch 9 - iter 144/1445 - loss 0.00822659 - time (sec): 7.19 - samples/sec: 2436.78 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:48:40,157 epoch 9 - iter 288/1445 - loss 0.00559292 - time (sec): 14.21 - samples/sec: 2471.90 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:48:47,291 epoch 9 - iter 432/1445 - loss 0.00857791 - time (sec): 21.34 - samples/sec: 2500.95 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:48:54,475 epoch 9 - iter 576/1445 - loss 0.01013173 - time (sec): 28.53 - samples/sec: 2500.07 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:49:01,473 epoch 9 - iter 720/1445 - loss 0.00984675 - time (sec): 35.53 - samples/sec: 2469.71 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:49:08,496 epoch 9 - iter 864/1445 - loss 0.00969007 - time (sec): 42.55 - samples/sec: 2481.24 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:49:15,662 epoch 9 - iter 1008/1445 - loss 0.00977286 - time (sec): 49.71 - samples/sec: 2478.89 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:49:24,103 epoch 9 - iter 1152/1445 - loss 0.01047676 - time (sec): 58.16 - samples/sec: 2431.23 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:49:31,147 epoch 9 - iter 1296/1445 - loss 0.00999327 - time (sec): 65.20 - samples/sec: 2423.11 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:49:38,579 epoch 9 - iter 1440/1445 - loss 0.00998262 - time (sec): 72.63 - samples/sec: 2418.14 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:49:38,821 ----------------------------------------------------------------------------------------------------
2023-10-17 17:49:38,821 EPOCH 9 done: loss 0.0100 - lr: 0.000003
2023-10-17 17:49:42,238 DEV : loss 0.14191032946109772 - f1-score (micro avg) 0.8661
2023-10-17 17:49:42,256 ----------------------------------------------------------------------------------------------------
2023-10-17 17:49:49,582 epoch 10 - iter 144/1445 - loss 0.00635513 - time (sec): 7.33 - samples/sec: 2531.30 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:49:56,970 epoch 10 - iter 288/1445 - loss 0.00533428 - time (sec): 14.71 - samples/sec: 2412.38 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:50:04,167 epoch 10 - iter 432/1445 - loss 0.00667599 - time (sec): 21.91 - samples/sec: 2408.36 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:50:11,207 epoch 10 - iter 576/1445 - loss 0.00622294 - time (sec): 28.95 - samples/sec: 2412.67 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:50:18,623 epoch 10 - iter 720/1445 - loss 0.00658126 - time (sec): 36.37 - samples/sec: 2430.04 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:50:25,770 epoch 10 - iter 864/1445 - loss 0.00769149 - time (sec): 43.51 - samples/sec: 2457.20 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:50:32,601 epoch 10 - iter 1008/1445 - loss 0.00793393 - time (sec): 50.34 - samples/sec: 2466.53 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:50:39,315 epoch 10 - iter 1152/1445 - loss 0.00724425 - time (sec): 57.06 - samples/sec: 2475.80 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:50:46,386 epoch 10 - iter 1296/1445 - loss 0.00766193 - time (sec): 64.13 - samples/sec: 2487.64 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:50:53,224 epoch 10 - iter 1440/1445 - loss 0.00738923 - time (sec): 70.97 - samples/sec: 2472.81 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:50:53,477 ----------------------------------------------------------------------------------------------------
2023-10-17 17:50:53,477 EPOCH 10 done: loss 0.0074 - lr: 0.000000
2023-10-17 17:50:56,778 DEV : loss 0.14468367397785187 - f1-score (micro avg) 0.8679
2023-10-17 17:50:57,176 ----------------------------------------------------------------------------------------------------
2023-10-17 17:50:57,177 Loading model from best epoch ...
2023-10-17 17:50:58,557 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 17:51:01,444
Results:
- F-score (micro) 0.8493
- F-score (macro) 0.7512
- Accuracy 0.7473
By class:
              precision    recall  f1-score   support

         PER     0.8596    0.8382    0.8487       482
         LOC     0.9236    0.8712    0.8966       458
         ORG     0.5849    0.4493    0.5082        69

   micro avg     0.8733    0.8266    0.8493      1009
   macro avg     0.7894    0.7195    0.7512      1009
weighted avg     0.8699    0.8266    0.8472      1009
2023-10-17 17:51:01,445 ----------------------------------------------------------------------------------------------------
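Usage sketch for the trained tagger; the checkpoint path is the best-model.pt under the base path logged above (adapt it to wherever the model actually lives, e.g. a published Hugging Face repo id), and the example sentence is made up.

from flair.data import Sentence
from flair.models import SequenceTagger

# load the checkpoint saved as "best-model.pt" during training
tagger = SequenceTagger.load(
    "hmbench-icdar/nl-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4/best-model.pt"
)

sentence = Sentence("Willem-Alexander bezocht gisteren Amsterdam .")  # hypothetical Dutch example
tagger.predict(sentence)
for span in sentence.get_spans("ner"):
    print(span)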