2023-10-17 17:30:36,893 ----------------------------------------------------------------------------------------------------
2023-10-17 17:30:36,894 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 17:30:36,894 ----------------------------------------------------------------------------------------------------
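The tagger printed above is a Flair SequenceTagger on top of transformer word embeddings: a 12-layer ELECTRA-style discriminator (hidden size 768), locked dropout, and a linear projection to the 17 NER tags, trained with plain cross-entropy (no CRF, matching the crfFalse flag in the run name). A minimal construction sketch; the backbone id is inferred from the base path logged below, and the constructor arguments are assumptions rather than the exact hmbench training script:

# Sketch (not the exact training script) of building the SequenceTagger printed above.
from flair.data import Dictionary
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger

# 17 BIOES tags, as listed at the end of this log.
tag_dictionary = Dictionary(add_unk=False)
for tag in ["O",
            "S-LOC", "B-LOC", "E-LOC", "I-LOC",
            "S-PER", "B-PER", "E-PER", "I-PER",
            "S-ORG", "B-ORG", "E-ORG", "I-ORG",
            "S-HumanProd", "B-HumanProd", "E-HumanProd", "I-HumanProd"]:
    tag_dictionary.add_item(tag)

embeddings = TransformerWordEmbeddings(
    model="hmteams/teams-base-historic-multilingual-discriminator",  # assumed hub id
    layers="-1",                   # "layers-1" in the run name
    subtoken_pooling="first",      # "poolingfirst" in the run name
    fine_tune=True,
)

tagger = SequenceTagger(
    hidden_size=256,               # unused without an RNN; placeholder value
    embeddings=embeddings,
    tag_dictionary=tag_dictionary,
    tag_type="ner",
    use_crf=False,                 # "crfFalse" in the run name
    use_rnn=False,
)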
2023-10-17 17:30:36,894 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-17 17:30:36,894 ----------------------------------------------------------------------------------------------------
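The training data is the Finnish newseye subset of HIPE-2022 (1,166 train / 165 dev / 415 test sentences). A hedged loading sketch using Flair's built-in HIPE-2022 wrapper; the keyword names are assumptions about that loader's signature:

# Sketch: load the HIPE-2022 "newseye" Finnish data through Flair's dataset wrapper.
# The keyword names (dataset_name, language) are assumptions.
from flair.datasets import NER_HIPE_2022

corpus = NER_HIPE_2022(dataset_name="newseye", language="fi")
print(corpus)  # expected: 1166 train + 165 dev + 415 test sentences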
2023-10-17 17:30:36,894 Train: 1166 sentences
2023-10-17 17:30:36,894 (train_with_dev=False, train_with_test=False)
2023-10-17 17:30:36,894 ----------------------------------------------------------------------------------------------------
2023-10-17 17:30:36,894 Training Params:
2023-10-17 17:30:36,894 - learning_rate: "5e-05"
2023-10-17 17:30:36,894 - mini_batch_size: "4"
2023-10-17 17:30:36,895 - max_epochs: "10"
2023-10-17 17:30:36,895 - shuffle: "True"
2023-10-17 17:30:36,895 ----------------------------------------------------------------------------------------------------
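These hyperparameters correspond to a standard Flair fine-tuning run. A minimal sketch of the call implied by this block, reusing the tagger and corpus sketched above; the hyperparameter values are taken verbatim from the log, while the exact keyword wiring is an assumption:

# Sketch of the fine-tuning call implied by the logged parameters (not the exact
# hmbench script). Hyperparameter values are copied from the log above.
from flair.trainers import ModelTrainer

trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator"
    "-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1",  # base path logged below
    learning_rate=5e-05,
    mini_batch_size=4,
    max_epochs=10,
    shuffle=True,
    main_evaluation_metric=("micro avg", "f1-score"),  # final-evaluation metric logged below
)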
2023-10-17 17:30:36,895 Plugins:
2023-10-17 17:30:36,895 - TensorboardLogger
2023-10-17 17:30:36,895 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 17:30:36,895 ----------------------------------------------------------------------------------------------------
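With 292 iterations per epoch and 10 epochs, the LinearScheduler warms the learning rate up over the first 10% of the 2,920 total steps (292 steps) and then decays it linearly to zero, which matches the lr values in the iteration logs below (about 0.000005 at iteration 29 of epoch 1, a peak near 0.000049, and 0 at the final step). A rough reconstruction under that assumption:

# Rough reconstruction (assumption) of the LinearScheduler with warmup_fraction=0.1,
# checked against the lr values in the iteration logs below.
def linear_lr(step, total_steps=10 * 292, peak_lr=5e-05, warmup_fraction=0.1):
    warmup_steps = int(warmup_fraction * total_steps)  # 292
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

print(round(linear_lr(29), 6))    # ~0.000005, matches epoch 1, iter 29
print(round(linear_lr(290), 6))   # ~0.00005, near the logged peak of 0.000049
print(round(linear_lr(2920), 6))  # 0.0, matches the final logged lr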
2023-10-17 17:30:36,895 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 17:30:36,895 - metric: "('micro avg', 'f1-score')"
2023-10-17 17:30:36,895 ----------------------------------------------------------------------------------------------------
2023-10-17 17:30:36,895 Computation:
2023-10-17 17:30:36,895 - compute on device: cuda:0
2023-10-17 17:30:36,895 - embedding storage: none
2023-10-17 17:30:36,895 ----------------------------------------------------------------------------------------------------
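The run computes on a single GPU and keeps no embeddings in storage between epochs. A two-line sketch of the device setting; how the "none" storage mode is passed to the training call (e.g. an embedding_storage_mode keyword) is an assumption:

# Sketch: pin Flair to the logged device. Embedding storage "none" would be an
# argument to the training call; the exact keyword name is an assumption.
import torch
import flair

flair.device = torch.device("cuda:0")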
2023-10-17 17:30:36,895 Model training base path: "hmbench-newseye/fi-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 17:30:36,895 ----------------------------------------------------------------------------------------------------
2023-10-17 17:30:36,895 ----------------------------------------------------------------------------------------------------
2023-10-17 17:30:36,895 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 17:30:38,477 epoch 1 - iter 29/292 - loss 3.30003254 - time (sec): 1.58 - samples/sec: 2413.28 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:30:40,085 epoch 1 - iter 58/292 - loss 2.46491688 - time (sec): 3.19 - samples/sec: 2550.87 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:30:42,223 epoch 1 - iter 87/292 - loss 1.78620105 - time (sec): 5.33 - samples/sec: 2612.08 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:30:43,795 epoch 1 - iter 116/292 - loss 1.44223007 - time (sec): 6.90 - samples/sec: 2692.60 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:30:45,319 epoch 1 - iter 145/292 - loss 1.26681101 - time (sec): 8.42 - samples/sec: 2676.66 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:30:46,932 epoch 1 - iter 174/292 - loss 1.11965052 - time (sec): 10.04 - samples/sec: 2657.49 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:30:48,467 epoch 1 - iter 203/292 - loss 1.00652496 - time (sec): 11.57 - samples/sec: 2649.79 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:30:50,183 epoch 1 - iter 232/292 - loss 0.90576718 - time (sec): 13.29 - samples/sec: 2675.23 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:30:51,677 epoch 1 - iter 261/292 - loss 0.84363510 - time (sec): 14.78 - samples/sec: 2669.12 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:30:53,437 epoch 1 - iter 290/292 - loss 0.77909776 - time (sec): 16.54 - samples/sec: 2670.90 - lr: 0.000049 - momentum: 0.000000
2023-10-17 17:30:53,535 ----------------------------------------------------------------------------------------------------
2023-10-17 17:30:53,536 EPOCH 1 done: loss 0.7757 - lr: 0.000049
2023-10-17 17:30:54,603 DEV : loss 0.17767061293125153 - f1-score (micro avg) 0.5031
2023-10-17 17:30:54,608 saving best model
2023-10-17 17:30:54,980 ----------------------------------------------------------------------------------------------------
2023-10-17 17:30:56,734 epoch 2 - iter 29/292 - loss 0.21655750 - time (sec): 1.75 - samples/sec: 2628.88 - lr: 0.000049 - momentum: 0.000000
2023-10-17 17:30:58,381 epoch 2 - iter 58/292 - loss 0.18535360 - time (sec): 3.40 - samples/sec: 2571.11 - lr: 0.000049 - momentum: 0.000000
2023-10-17 17:31:00,019 epoch 2 - iter 87/292 - loss 0.17442449 - time (sec): 5.04 - samples/sec: 2614.31 - lr: 0.000048 - momentum: 0.000000
2023-10-17 17:31:01,946 epoch 2 - iter 116/292 - loss 0.17059919 - time (sec): 6.96 - samples/sec: 2627.44 - lr: 0.000048 - momentum: 0.000000
2023-10-17 17:31:03,656 epoch 2 - iter 145/292 - loss 0.17086627 - time (sec): 8.67 - samples/sec: 2655.51 - lr: 0.000047 - momentum: 0.000000
2023-10-17 17:31:05,276 epoch 2 - iter 174/292 - loss 0.17566071 - time (sec): 10.29 - samples/sec: 2678.53 - lr: 0.000047 - momentum: 0.000000
2023-10-17 17:31:06,827 epoch 2 - iter 203/292 - loss 0.18058009 - time (sec): 11.85 - samples/sec: 2649.44 - lr: 0.000046 - momentum: 0.000000
2023-10-17 17:31:08,397 epoch 2 - iter 232/292 - loss 0.19124050 - time (sec): 13.42 - samples/sec: 2617.53 - lr: 0.000046 - momentum: 0.000000
2023-10-17 17:31:10,207 epoch 2 - iter 261/292 - loss 0.18814577 - time (sec): 15.22 - samples/sec: 2640.87 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:31:11,880 epoch 2 - iter 290/292 - loss 0.18197479 - time (sec): 16.90 - samples/sec: 2619.81 - lr: 0.000045 - momentum: 0.000000
2023-10-17 17:31:11,977 ----------------------------------------------------------------------------------------------------
2023-10-17 17:31:11,977 EPOCH 2 done: loss 0.1824 - lr: 0.000045
2023-10-17 17:31:13,308 DEV : loss 0.14048103988170624 - f1-score (micro avg) 0.6466
2023-10-17 17:31:13,313 saving best model
2023-10-17 17:31:13,804 ----------------------------------------------------------------------------------------------------
2023-10-17 17:31:15,639 epoch 3 - iter 29/292 - loss 0.11981676 - time (sec): 1.83 - samples/sec: 2485.88 - lr: 0.000044 - momentum: 0.000000
2023-10-17 17:31:17,273 epoch 3 - iter 58/292 - loss 0.10010235 - time (sec): 3.46 - samples/sec: 2594.59 - lr: 0.000043 - momentum: 0.000000
2023-10-17 17:31:18,927 epoch 3 - iter 87/292 - loss 0.12124707 - time (sec): 5.12 - samples/sec: 2673.04 - lr: 0.000043 - momentum: 0.000000
2023-10-17 17:31:20,473 epoch 3 - iter 116/292 - loss 0.11456222 - time (sec): 6.67 - samples/sec: 2652.45 - lr: 0.000042 - momentum: 0.000000
2023-10-17 17:31:22,123 epoch 3 - iter 145/292 - loss 0.11612834 - time (sec): 8.32 - samples/sec: 2626.07 - lr: 0.000042 - momentum: 0.000000
2023-10-17 17:31:23,684 epoch 3 - iter 174/292 - loss 0.10891041 - time (sec): 9.88 - samples/sec: 2616.21 - lr: 0.000041 - momentum: 0.000000
2023-10-17 17:31:25,443 epoch 3 - iter 203/292 - loss 0.10851194 - time (sec): 11.64 - samples/sec: 2659.50 - lr: 0.000041 - momentum: 0.000000
2023-10-17 17:31:27,180 epoch 3 - iter 232/292 - loss 0.10532314 - time (sec): 13.37 - samples/sec: 2661.79 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:31:28,766 epoch 3 - iter 261/292 - loss 0.10435454 - time (sec): 14.96 - samples/sec: 2649.95 - lr: 0.000040 - momentum: 0.000000
2023-10-17 17:31:30,574 epoch 3 - iter 290/292 - loss 0.10430299 - time (sec): 16.77 - samples/sec: 2632.39 - lr: 0.000039 - momentum: 0.000000
2023-10-17 17:31:30,687 ----------------------------------------------------------------------------------------------------
2023-10-17 17:31:30,687 EPOCH 3 done: loss 0.1054 - lr: 0.000039
2023-10-17 17:31:31,979 DEV : loss 0.14705337584018707 - f1-score (micro avg) 0.7097
2023-10-17 17:31:31,985 saving best model
2023-10-17 17:31:32,456 ----------------------------------------------------------------------------------------------------
2023-10-17 17:31:33,879 epoch 4 - iter 29/292 - loss 0.08724733 - time (sec): 1.42 - samples/sec: 2505.49 - lr: 0.000038 - momentum: 0.000000
2023-10-17 17:31:35,458 epoch 4 - iter 58/292 - loss 0.06675805 - time (sec): 3.00 - samples/sec: 2619.19 - lr: 0.000038 - momentum: 0.000000
2023-10-17 17:31:37,297 epoch 4 - iter 87/292 - loss 0.06063724 - time (sec): 4.84 - samples/sec: 2651.68 - lr: 0.000037 - momentum: 0.000000
2023-10-17 17:31:39,062 epoch 4 - iter 116/292 - loss 0.05953907 - time (sec): 6.60 - samples/sec: 2643.44 - lr: 0.000037 - momentum: 0.000000
2023-10-17 17:31:40,676 epoch 4 - iter 145/292 - loss 0.05856854 - time (sec): 8.22 - samples/sec: 2650.73 - lr: 0.000036 - momentum: 0.000000
2023-10-17 17:31:42,711 epoch 4 - iter 174/292 - loss 0.06007165 - time (sec): 10.25 - samples/sec: 2603.20 - lr: 0.000036 - momentum: 0.000000
2023-10-17 17:31:44,539 epoch 4 - iter 203/292 - loss 0.06473355 - time (sec): 12.08 - samples/sec: 2603.57 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:31:46,265 epoch 4 - iter 232/292 - loss 0.06599594 - time (sec): 13.81 - samples/sec: 2586.77 - lr: 0.000035 - momentum: 0.000000
2023-10-17 17:31:47,862 epoch 4 - iter 261/292 - loss 0.06883779 - time (sec): 15.40 - samples/sec: 2608.51 - lr: 0.000034 - momentum: 0.000000
2023-10-17 17:31:49,518 epoch 4 - iter 290/292 - loss 0.06894707 - time (sec): 17.06 - samples/sec: 2589.38 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:31:49,618 ----------------------------------------------------------------------------------------------------
2023-10-17 17:31:49,618 EPOCH 4 done: loss 0.0689 - lr: 0.000033
2023-10-17 17:31:50,909 DEV : loss 0.1414823830127716 - f1-score (micro avg) 0.7249
2023-10-17 17:31:50,915 saving best model
2023-10-17 17:31:51,410 ----------------------------------------------------------------------------------------------------
2023-10-17 17:31:53,319 epoch 5 - iter 29/292 - loss 0.06241394 - time (sec): 1.91 - samples/sec: 2724.21 - lr: 0.000033 - momentum: 0.000000
2023-10-17 17:31:55,075 epoch 5 - iter 58/292 - loss 0.04367683 - time (sec): 3.66 - samples/sec: 2698.48 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:31:56,641 epoch 5 - iter 87/292 - loss 0.04656726 - time (sec): 5.23 - samples/sec: 2727.79 - lr: 0.000032 - momentum: 0.000000
2023-10-17 17:31:58,099 epoch 5 - iter 116/292 - loss 0.05088141 - time (sec): 6.69 - samples/sec: 2702.78 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:31:59,745 epoch 5 - iter 145/292 - loss 0.05467026 - time (sec): 8.33 - samples/sec: 2695.21 - lr: 0.000031 - momentum: 0.000000
2023-10-17 17:32:01,457 epoch 5 - iter 174/292 - loss 0.05364881 - time (sec): 10.04 - samples/sec: 2717.40 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:32:03,097 epoch 5 - iter 203/292 - loss 0.05135010 - time (sec): 11.68 - samples/sec: 2717.29 - lr: 0.000030 - momentum: 0.000000
2023-10-17 17:32:04,745 epoch 5 - iter 232/292 - loss 0.05250977 - time (sec): 13.33 - samples/sec: 2711.88 - lr: 0.000029 - momentum: 0.000000
2023-10-17 17:32:06,387 epoch 5 - iter 261/292 - loss 0.05110472 - time (sec): 14.97 - samples/sec: 2669.84 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:32:07,968 epoch 5 - iter 290/292 - loss 0.04973646 - time (sec): 16.55 - samples/sec: 2664.60 - lr: 0.000028 - momentum: 0.000000
2023-10-17 17:32:08,086 ----------------------------------------------------------------------------------------------------
2023-10-17 17:32:08,086 EPOCH 5 done: loss 0.0494 - lr: 0.000028
2023-10-17 17:32:09,439 DEV : loss 0.14127209782600403 - f1-score (micro avg) 0.7167
2023-10-17 17:32:09,447 ----------------------------------------------------------------------------------------------------
2023-10-17 17:32:11,036 epoch 6 - iter 29/292 - loss 0.04625909 - time (sec): 1.59 - samples/sec: 2819.48 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:32:12,510 epoch 6 - iter 58/292 - loss 0.04056874 - time (sec): 3.06 - samples/sec: 2643.07 - lr: 0.000027 - momentum: 0.000000
2023-10-17 17:32:14,343 epoch 6 - iter 87/292 - loss 0.03655091 - time (sec): 4.89 - samples/sec: 2634.06 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:32:15,975 epoch 6 - iter 116/292 - loss 0.03649962 - time (sec): 6.53 - samples/sec: 2633.52 - lr: 0.000026 - momentum: 0.000000
2023-10-17 17:32:17,546 epoch 6 - iter 145/292 - loss 0.03544033 - time (sec): 8.10 - samples/sec: 2558.87 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:32:19,276 epoch 6 - iter 174/292 - loss 0.03744064 - time (sec): 9.83 - samples/sec: 2544.08 - lr: 0.000025 - momentum: 0.000000
2023-10-17 17:32:21,052 epoch 6 - iter 203/292 - loss 0.03723482 - time (sec): 11.60 - samples/sec: 2562.11 - lr: 0.000024 - momentum: 0.000000
2023-10-17 17:32:22,754 epoch 6 - iter 232/292 - loss 0.03746420 - time (sec): 13.30 - samples/sec: 2586.93 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:32:24,378 epoch 6 - iter 261/292 - loss 0.03589379 - time (sec): 14.93 - samples/sec: 2595.57 - lr: 0.000023 - momentum: 0.000000
2023-10-17 17:32:26,153 epoch 6 - iter 290/292 - loss 0.03421415 - time (sec): 16.70 - samples/sec: 2650.81 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:32:26,246 ----------------------------------------------------------------------------------------------------
2023-10-17 17:32:26,246 EPOCH 6 done: loss 0.0346 - lr: 0.000022
2023-10-17 17:32:27,563 DEV : loss 0.18597932159900665 - f1-score (micro avg) 0.7516
2023-10-17 17:32:27,586 saving best model
2023-10-17 17:32:28,046 ----------------------------------------------------------------------------------------------------
2023-10-17 17:32:29,834 epoch 7 - iter 29/292 - loss 0.00818199 - time (sec): 1.78 - samples/sec: 2760.52 - lr: 0.000022 - momentum: 0.000000
2023-10-17 17:32:31,568 epoch 7 - iter 58/292 - loss 0.02426964 - time (sec): 3.52 - samples/sec: 2614.31 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:32:33,249 epoch 7 - iter 87/292 - loss 0.02134647 - time (sec): 5.20 - samples/sec: 2662.96 - lr: 0.000021 - momentum: 0.000000
2023-10-17 17:32:35,089 epoch 7 - iter 116/292 - loss 0.01965762 - time (sec): 7.04 - samples/sec: 2649.10 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:32:36,865 epoch 7 - iter 145/292 - loss 0.01930732 - time (sec): 8.81 - samples/sec: 2675.36 - lr: 0.000020 - momentum: 0.000000
2023-10-17 17:32:38,478 epoch 7 - iter 174/292 - loss 0.02383070 - time (sec): 10.43 - samples/sec: 2670.87 - lr: 0.000019 - momentum: 0.000000
2023-10-17 17:32:40,141 epoch 7 - iter 203/292 - loss 0.02392436 - time (sec): 12.09 - samples/sec: 2694.46 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:32:41,705 epoch 7 - iter 232/292 - loss 0.02288270 - time (sec): 13.65 - samples/sec: 2691.62 - lr: 0.000018 - momentum: 0.000000
2023-10-17 17:32:43,429 epoch 7 - iter 261/292 - loss 0.02486344 - time (sec): 15.38 - samples/sec: 2630.28 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:32:44,992 epoch 7 - iter 290/292 - loss 0.02465437 - time (sec): 16.94 - samples/sec: 2613.50 - lr: 0.000017 - momentum: 0.000000
2023-10-17 17:32:45,090 ----------------------------------------------------------------------------------------------------
2023-10-17 17:32:45,090 EPOCH 7 done: loss 0.0247 - lr: 0.000017
2023-10-17 17:32:46,390 DEV : loss 0.18214021623134613 - f1-score (micro avg) 0.745
2023-10-17 17:32:46,398 ----------------------------------------------------------------------------------------------------
2023-10-17 17:32:48,160 epoch 8 - iter 29/292 - loss 0.01468803 - time (sec): 1.76 - samples/sec: 2688.71 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:32:49,913 epoch 8 - iter 58/292 - loss 0.02488023 - time (sec): 3.51 - samples/sec: 2680.57 - lr: 0.000016 - momentum: 0.000000
2023-10-17 17:32:51,722 epoch 8 - iter 87/292 - loss 0.02302285 - time (sec): 5.32 - samples/sec: 2676.97 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:32:53,284 epoch 8 - iter 116/292 - loss 0.02543944 - time (sec): 6.88 - samples/sec: 2644.23 - lr: 0.000015 - momentum: 0.000000
2023-10-17 17:32:54,913 epoch 8 - iter 145/292 - loss 0.02236377 - time (sec): 8.51 - samples/sec: 2591.58 - lr: 0.000014 - momentum: 0.000000
2023-10-17 17:32:56,481 epoch 8 - iter 174/292 - loss 0.02166988 - time (sec): 10.08 - samples/sec: 2560.01 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:32:58,182 epoch 8 - iter 203/292 - loss 0.02064423 - time (sec): 11.78 - samples/sec: 2571.56 - lr: 0.000013 - momentum: 0.000000
2023-10-17 17:32:59,794 epoch 8 - iter 232/292 - loss 0.01874188 - time (sec): 13.39 - samples/sec: 2563.73 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:33:01,784 epoch 8 - iter 261/292 - loss 0.01891195 - time (sec): 15.38 - samples/sec: 2622.51 - lr: 0.000012 - momentum: 0.000000
2023-10-17 17:33:03,287 epoch 8 - iter 290/292 - loss 0.01868060 - time (sec): 16.89 - samples/sec: 2626.97 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:33:03,377 ----------------------------------------------------------------------------------------------------
2023-10-17 17:33:03,377 EPOCH 8 done: loss 0.0187 - lr: 0.000011
2023-10-17 17:33:04,692 DEV : loss 0.19712962210178375 - f1-score (micro avg) 0.7385
2023-10-17 17:33:04,697 ----------------------------------------------------------------------------------------------------
2023-10-17 17:33:06,387 epoch 9 - iter 29/292 - loss 0.01245605 - time (sec): 1.69 - samples/sec: 2525.20 - lr: 0.000011 - momentum: 0.000000
2023-10-17 17:33:07,995 epoch 9 - iter 58/292 - loss 0.01509373 - time (sec): 3.30 - samples/sec: 2471.69 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:33:09,564 epoch 9 - iter 87/292 - loss 0.01256178 - time (sec): 4.87 - samples/sec: 2439.19 - lr: 0.000010 - momentum: 0.000000
2023-10-17 17:33:11,182 epoch 9 - iter 116/292 - loss 0.01064407 - time (sec): 6.48 - samples/sec: 2460.99 - lr: 0.000009 - momentum: 0.000000
2023-10-17 17:33:12,919 epoch 9 - iter 145/292 - loss 0.00982040 - time (sec): 8.22 - samples/sec: 2531.73 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:33:14,624 epoch 9 - iter 174/292 - loss 0.01005871 - time (sec): 9.93 - samples/sec: 2563.45 - lr: 0.000008 - momentum: 0.000000
2023-10-17 17:33:16,259 epoch 9 - iter 203/292 - loss 0.00902004 - time (sec): 11.56 - samples/sec: 2562.04 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:33:18,149 epoch 9 - iter 232/292 - loss 0.00850666 - time (sec): 13.45 - samples/sec: 2571.37 - lr: 0.000007 - momentum: 0.000000
2023-10-17 17:33:19,863 epoch 9 - iter 261/292 - loss 0.00998313 - time (sec): 15.17 - samples/sec: 2599.17 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:33:21,594 epoch 9 - iter 290/292 - loss 0.01091769 - time (sec): 16.90 - samples/sec: 2601.11 - lr: 0.000006 - momentum: 0.000000
2023-10-17 17:33:21,787 ----------------------------------------------------------------------------------------------------
2023-10-17 17:33:21,788 EPOCH 9 done: loss 0.0109 - lr: 0.000006
2023-10-17 17:33:23,047 DEV : loss 0.2052023559808731 - f1-score (micro avg) 0.7632
2023-10-17 17:33:23,053 saving best model
2023-10-17 17:33:23,537 ----------------------------------------------------------------------------------------------------
2023-10-17 17:33:25,290 epoch 10 - iter 29/292 - loss 0.01432242 - time (sec): 1.75 - samples/sec: 2794.01 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:33:26,991 epoch 10 - iter 58/292 - loss 0.01708261 - time (sec): 3.45 - samples/sec: 2798.02 - lr: 0.000005 - momentum: 0.000000
2023-10-17 17:33:28,628 epoch 10 - iter 87/292 - loss 0.01497146 - time (sec): 5.09 - samples/sec: 2727.39 - lr: 0.000004 - momentum: 0.000000
2023-10-17 17:33:30,287 epoch 10 - iter 116/292 - loss 0.01184976 - time (sec): 6.75 - samples/sec: 2682.04 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:33:31,906 epoch 10 - iter 145/292 - loss 0.00956243 - time (sec): 8.37 - samples/sec: 2689.98 - lr: 0.000003 - momentum: 0.000000
2023-10-17 17:33:33,596 epoch 10 - iter 174/292 - loss 0.00983371 - time (sec): 10.06 - samples/sec: 2667.52 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:33:35,413 epoch 10 - iter 203/292 - loss 0.01046263 - time (sec): 11.87 - samples/sec: 2678.36 - lr: 0.000002 - momentum: 0.000000
2023-10-17 17:33:37,022 epoch 10 - iter 232/292 - loss 0.01126760 - time (sec): 13.48 - samples/sec: 2647.13 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:33:38,797 epoch 10 - iter 261/292 - loss 0.01062140 - time (sec): 15.26 - samples/sec: 2661.98 - lr: 0.000001 - momentum: 0.000000
2023-10-17 17:33:40,353 epoch 10 - iter 290/292 - loss 0.00996671 - time (sec): 16.81 - samples/sec: 2632.41 - lr: 0.000000 - momentum: 0.000000
2023-10-17 17:33:40,459 ----------------------------------------------------------------------------------------------------
2023-10-17 17:33:40,459 EPOCH 10 done: loss 0.0099 - lr: 0.000000
2023-10-17 17:33:41,754 DEV : loss 0.19731809198856354 - f1-score (micro avg) 0.7549
2023-10-17 17:33:42,117 ----------------------------------------------------------------------------------------------------
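The dev micro-F1 peaks at 0.7632 in epoch 9, so that checkpoint is the one kept as best-model.pt and reloaded next. A minimal sketch of the selection rule implied by the "saving best model" lines, with the dev scores copied from the log:

# Selection rule implied by the log: keep the checkpoint with the best dev micro-F1.
dev_f1_by_epoch = [0.5031, 0.6466, 0.7097, 0.7249, 0.7167,
                   0.7516, 0.7450, 0.7385, 0.7632, 0.7549]
best_epoch = max(range(1, 11), key=lambda e: dev_f1_by_epoch[e - 1])
print(best_epoch, dev_f1_by_epoch[best_epoch - 1])  # 9 0.7632 -> best-model.pt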
2023-10-17 17:33:42,119 Loading model from best epoch ...
2023-10-17 17:33:43,724 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
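The 17 tags form a BIOES tagging scheme over four entity types (LOC, PER, ORG, HumanProd). For illustration, a small decoder (a hypothetical helper, not part of Flair) that turns a BIOES tag sequence into labeled spans:

# Illustrative BIOES decoder (hypothetical helper, not Flair code): S-X marks a
# single-token span, B-X ... E-X delimits a multi-token span of type X.
def bioes_to_spans(tags):
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag.startswith("S-"):
            spans.append((i, i, tag[2:]))
        elif tag.startswith("B-"):
            start = i
        elif tag.startswith("E-") and start is not None:
            spans.append((start, i, tag[2:]))
            start = None
    return spans

print(bioes_to_spans(["O", "S-LOC", "B-PER", "I-PER", "E-PER"]))
# [(1, 1, 'LOC'), (2, 4, 'PER')]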
2023-10-17 17:33:46,279
Results:
- F-score (micro) 0.7605
- F-score (macro) 0.6957
- Accuracy 0.6295
By class:
              precision    recall  f1-score   support

         PER     0.8244    0.8362    0.8302       348
         LOC     0.6387    0.8467    0.7282       261
         ORG     0.4902    0.4808    0.4854        52
   HumanProd     0.7083    0.7727    0.7391        22

   micro avg     0.7158    0.8111    0.7605       683
   macro avg     0.6654    0.7341    0.6957       683
weighted avg     0.7242    0.8111    0.7621       683
2023-10-17 17:33:46,279 ----------------------------------------------------------------------------------------------------
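To reuse the fine-tuned tagger (test micro-F1 0.7605), the checkpoint named above can be loaded directly with Flair. A usage sketch; the example sentence is arbitrary Finnish text, not taken from the corpus:

# Usage sketch: load the best checkpoint written during this run and tag a sentence.
# "best-model.pt" is the file named in the log; the example text is arbitrary.
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load("best-model.pt")
sentence = Sentence("Helsingin Sanomat kertoo uutisia Suomesta .")
tagger.predict(sentence)
for span in sentence.get_spans("ner"):
    print(span.text, span.tag)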