Commit 77c4f8a by stefan-it: "Upload folder using huggingface_hub"
2023-10-16 18:47:30,788 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:30,789 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-16 18:47:30,789 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:30,789 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:47:30,789 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:30,789 Train: 1166 sentences
2023-10-16 18:47:30,789 (train_with_dev=False, train_with_test=False)
2023-10-16 18:47:30,789 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:30,789 Training Params:
2023-10-16 18:47:30,789 - learning_rate: "3e-05"
2023-10-16 18:47:30,789 - mini_batch_size: "4"
2023-10-16 18:47:30,789 - max_epochs: "10"
2023-10-16 18:47:30,789 - shuffle: "True"
2023-10-16 18:47:30,789 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:30,789 Plugins:
2023-10-16 18:47:30,789 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:47:30,789 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:30,789 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:47:30,790 - metric: "('micro avg', 'f1-score')"
2023-10-16 18:47:30,790 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:30,790 Computation:
2023-10-16 18:47:30,790 - compute on device: cuda:0
2023-10-16 18:47:30,790 - embedding storage: none
2023-10-16 18:47:30,790 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:30,790 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-16 18:47:30,790 ----------------------------------------------------------------------------------------------------
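The configuration logged above (historic multilingual BERT backbone, subtoken pooling "first", final layer only, no CRF, batch size 4, 10 epochs, lr 3e-05 with a linear scheduler at 10% warmup) can be sketched as a Flair fine-tuning script. This is a hypothetical reconstruction from the logged parameters, not the original hmbench training code; the `NER_HIPE_2022` dataset arguments, `hidden_size`, and the exact `fine_tune()` keywords are assumptions.

```python
# Hypothetical reconstruction of this run from the logged parameters
# (not the original hmbench script); dataset/embedding arguments are assumptions.
HPARAMS = {
    "learning_rate": 3e-05,  # logged: learning_rate "3e-05"
    "mini_batch_size": 4,    # logged: mini_batch_size "4"
    "max_epochs": 10,        # logged: max_epochs "10"
    "warmup_fraction": 0.1,  # logged: LinearScheduler | warmup_fraction '0.1'
}

def train(base_path="hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"):
    # Flair imports kept local so HPARAMS can be inspected without Flair installed.
    from flair.datasets import NER_HIPE_2022
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    corpus = NER_HIPE_2022(dataset_name="newseye", language="fi")  # assumed args
    label_dict = corpus.make_label_dictionary(label_type="ner")

    embeddings = TransformerWordEmbeddings(
        model="dbmdz/bert-base-historic-multilingual-cased",
        layers="-1",               # logged: layers-1
        subtoken_pooling="first",  # logged: poolingfirst
        fine_tune=True,
    )
    # The printed model has only locked dropout + a linear head on top of BERT,
    # hence no RNN, no reprojection, no CRF.
    tagger = SequenceTagger(
        hidden_size=256,  # unused when use_rnn=False, but required by the API
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,    # logged: crfFalse
        use_rnn=False,
        reproject_embeddings=False,
    )
    # fine_tune() uses AdamW with a linear warmup schedule, which is why the
    # log reports momentum 0.000000 throughout.
    ModelTrainer(tagger, corpus).fine_tune(
        base_path,
        learning_rate=HPARAMS["learning_rate"],
        mini_batch_size=HPARAMS["mini_batch_size"],
        max_epochs=HPARAMS["max_epochs"],
        warmup_fraction=HPARAMS["warmup_fraction"],
    )
```

Calling `train()` would download the corpus and backbone and run the full fine-tuning loop; the function is only defined here, not executed.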
2023-10-16 18:47:30,790 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:32,650 epoch 1 - iter 29/292 - loss 2.82911046 - time (sec): 1.86 - samples/sec: 2422.25 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:47:34,389 epoch 1 - iter 58/292 - loss 2.49086620 - time (sec): 3.60 - samples/sec: 2514.87 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:47:36,229 epoch 1 - iter 87/292 - loss 1.84331573 - time (sec): 5.44 - samples/sec: 2561.21 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:47:37,911 epoch 1 - iter 116/292 - loss 1.51218308 - time (sec): 7.12 - samples/sec: 2586.19 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:47:39,496 epoch 1 - iter 145/292 - loss 1.36167935 - time (sec): 8.70 - samples/sec: 2538.62 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:47:41,263 epoch 1 - iter 174/292 - loss 1.17652481 - time (sec): 10.47 - samples/sec: 2577.89 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:47:42,961 epoch 1 - iter 203/292 - loss 1.06327340 - time (sec): 12.17 - samples/sec: 2578.85 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:47:44,670 epoch 1 - iter 232/292 - loss 0.97664361 - time (sec): 13.88 - samples/sec: 2550.83 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:47:46,367 epoch 1 - iter 261/292 - loss 0.90519041 - time (sec): 15.58 - samples/sec: 2527.96 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:47:48,221 epoch 1 - iter 290/292 - loss 0.84527451 - time (sec): 17.43 - samples/sec: 2542.47 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:47:48,318 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:48,318 EPOCH 1 done: loss 0.8436 - lr: 0.000030
2023-10-16 18:47:49,558 DEV : loss 0.21602800488471985 - f1-score (micro avg) 0.337
2023-10-16 18:47:49,563 saving best model
2023-10-16 18:47:50,004 ----------------------------------------------------------------------------------------------------
2023-10-16 18:47:51,866 epoch 2 - iter 29/292 - loss 0.24466333 - time (sec): 1.86 - samples/sec: 2815.32 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:47:53,581 epoch 2 - iter 58/292 - loss 0.20933144 - time (sec): 3.58 - samples/sec: 2721.81 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:47:55,438 epoch 2 - iter 87/292 - loss 0.21419564 - time (sec): 5.43 - samples/sec: 2543.98 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:47:57,300 epoch 2 - iter 116/292 - loss 0.22692878 - time (sec): 7.29 - samples/sec: 2538.49 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:47:59,284 epoch 2 - iter 145/292 - loss 0.23462881 - time (sec): 9.28 - samples/sec: 2553.12 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:48:01,011 epoch 2 - iter 174/292 - loss 0.22426306 - time (sec): 11.01 - samples/sec: 2592.85 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:48:02,442 epoch 2 - iter 203/292 - loss 0.22508575 - time (sec): 12.44 - samples/sec: 2565.16 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:48:03,851 epoch 2 - iter 232/292 - loss 0.22081040 - time (sec): 13.85 - samples/sec: 2543.91 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:48:05,712 epoch 2 - iter 261/292 - loss 0.21204480 - time (sec): 15.71 - samples/sec: 2564.40 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:48:07,375 epoch 2 - iter 290/292 - loss 0.20672562 - time (sec): 17.37 - samples/sec: 2551.57 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:48:07,458 ----------------------------------------------------------------------------------------------------
2023-10-16 18:48:07,458 EPOCH 2 done: loss 0.2062 - lr: 0.000027
2023-10-16 18:48:08,760 DEV : loss 0.15461082756519318 - f1-score (micro avg) 0.5798
2023-10-16 18:48:08,766 saving best model
2023-10-16 18:48:09,304 ----------------------------------------------------------------------------------------------------
2023-10-16 18:48:10,958 epoch 3 - iter 29/292 - loss 0.14468212 - time (sec): 1.65 - samples/sec: 2577.11 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:48:12,664 epoch 3 - iter 58/292 - loss 0.14204837 - time (sec): 3.36 - samples/sec: 2634.33 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:48:14,378 epoch 3 - iter 87/292 - loss 0.12917051 - time (sec): 5.07 - samples/sec: 2582.04 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:48:16,152 epoch 3 - iter 116/292 - loss 0.12946334 - time (sec): 6.85 - samples/sec: 2590.11 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:48:17,863 epoch 3 - iter 145/292 - loss 0.12001689 - time (sec): 8.56 - samples/sec: 2620.61 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:48:19,560 epoch 3 - iter 174/292 - loss 0.11169965 - time (sec): 10.25 - samples/sec: 2630.65 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:48:21,157 epoch 3 - iter 203/292 - loss 0.11960085 - time (sec): 11.85 - samples/sec: 2601.45 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:48:22,949 epoch 3 - iter 232/292 - loss 0.11698755 - time (sec): 13.64 - samples/sec: 2589.10 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:48:24,708 epoch 3 - iter 261/292 - loss 0.11551965 - time (sec): 15.40 - samples/sec: 2562.20 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:48:26,558 epoch 3 - iter 290/292 - loss 0.11480963 - time (sec): 17.25 - samples/sec: 2568.65 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:48:26,657 ----------------------------------------------------------------------------------------------------
2023-10-16 18:48:26,657 EPOCH 3 done: loss 0.1146 - lr: 0.000023
2023-10-16 18:48:27,913 DEV : loss 0.13052009046077728 - f1-score (micro avg) 0.679
2023-10-16 18:48:27,919 saving best model
2023-10-16 18:48:28,417 ----------------------------------------------------------------------------------------------------
2023-10-16 18:48:30,132 epoch 4 - iter 29/292 - loss 0.08537570 - time (sec): 1.71 - samples/sec: 2536.85 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:48:31,767 epoch 4 - iter 58/292 - loss 0.07028676 - time (sec): 3.35 - samples/sec: 2495.01 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:48:33,404 epoch 4 - iter 87/292 - loss 0.07384165 - time (sec): 4.99 - samples/sec: 2550.32 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:48:34,978 epoch 4 - iter 116/292 - loss 0.07106452 - time (sec): 6.56 - samples/sec: 2518.68 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:48:36,765 epoch 4 - iter 145/292 - loss 0.07166389 - time (sec): 8.35 - samples/sec: 2633.37 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:48:38,779 epoch 4 - iter 174/292 - loss 0.07127372 - time (sec): 10.36 - samples/sec: 2532.09 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:48:40,421 epoch 4 - iter 203/292 - loss 0.07109717 - time (sec): 12.00 - samples/sec: 2539.84 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:48:42,238 epoch 4 - iter 232/292 - loss 0.07908090 - time (sec): 13.82 - samples/sec: 2559.91 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:48:44,005 epoch 4 - iter 261/292 - loss 0.07663505 - time (sec): 15.59 - samples/sec: 2542.07 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:48:45,789 epoch 4 - iter 290/292 - loss 0.07642799 - time (sec): 17.37 - samples/sec: 2534.38 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:48:45,903 ----------------------------------------------------------------------------------------------------
2023-10-16 18:48:45,903 EPOCH 4 done: loss 0.0759 - lr: 0.000020
2023-10-16 18:48:47,155 DEV : loss 0.132146954536438 - f1-score (micro avg) 0.6763
2023-10-16 18:48:47,160 ----------------------------------------------------------------------------------------------------
2023-10-16 18:48:48,969 epoch 5 - iter 29/292 - loss 0.04044493 - time (sec): 1.81 - samples/sec: 2428.77 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:48:50,548 epoch 5 - iter 58/292 - loss 0.03579751 - time (sec): 3.39 - samples/sec: 2449.11 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:48:52,506 epoch 5 - iter 87/292 - loss 0.05205437 - time (sec): 5.34 - samples/sec: 2486.93 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:48:54,176 epoch 5 - iter 116/292 - loss 0.04956530 - time (sec): 7.01 - samples/sec: 2520.65 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:48:55,809 epoch 5 - iter 145/292 - loss 0.05364871 - time (sec): 8.65 - samples/sec: 2590.18 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:48:57,397 epoch 5 - iter 174/292 - loss 0.05566681 - time (sec): 10.24 - samples/sec: 2588.97 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:48:59,058 epoch 5 - iter 203/292 - loss 0.05511954 - time (sec): 11.90 - samples/sec: 2581.78 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:49:00,736 epoch 5 - iter 232/292 - loss 0.05166875 - time (sec): 13.57 - samples/sec: 2592.74 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:49:02,332 epoch 5 - iter 261/292 - loss 0.05145842 - time (sec): 15.17 - samples/sec: 2613.29 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:49:04,070 epoch 5 - iter 290/292 - loss 0.05131951 - time (sec): 16.91 - samples/sec: 2622.72 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:49:04,156 ----------------------------------------------------------------------------------------------------
2023-10-16 18:49:04,156 EPOCH 5 done: loss 0.0517 - lr: 0.000017
2023-10-16 18:49:05,439 DEV : loss 0.12989631295204163 - f1-score (micro avg) 0.7468
2023-10-16 18:49:05,445 saving best model
2023-10-16 18:49:05,950 ----------------------------------------------------------------------------------------------------
2023-10-16 18:49:07,718 epoch 6 - iter 29/292 - loss 0.03368459 - time (sec): 1.77 - samples/sec: 2588.74 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:49:09,383 epoch 6 - iter 58/292 - loss 0.03595871 - time (sec): 3.43 - samples/sec: 2661.03 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:49:11,087 epoch 6 - iter 87/292 - loss 0.03750286 - time (sec): 5.13 - samples/sec: 2663.44 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:49:12,734 epoch 6 - iter 116/292 - loss 0.03535109 - time (sec): 6.78 - samples/sec: 2658.92 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:49:14,462 epoch 6 - iter 145/292 - loss 0.03424649 - time (sec): 8.51 - samples/sec: 2679.85 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:49:16,119 epoch 6 - iter 174/292 - loss 0.03583046 - time (sec): 10.17 - samples/sec: 2686.52 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:49:17,673 epoch 6 - iter 203/292 - loss 0.03342373 - time (sec): 11.72 - samples/sec: 2643.28 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:49:19,337 epoch 6 - iter 232/292 - loss 0.03942726 - time (sec): 13.38 - samples/sec: 2666.94 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:49:20,924 epoch 6 - iter 261/292 - loss 0.03835690 - time (sec): 14.97 - samples/sec: 2653.46 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:49:22,612 epoch 6 - iter 290/292 - loss 0.03977398 - time (sec): 16.66 - samples/sec: 2657.84 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:49:22,700 ----------------------------------------------------------------------------------------------------
2023-10-16 18:49:22,700 EPOCH 6 done: loss 0.0397 - lr: 0.000013
2023-10-16 18:49:23,944 DEV : loss 0.14868424832820892 - f1-score (micro avg) 0.7473
2023-10-16 18:49:23,949 saving best model
2023-10-16 18:49:24,501 ----------------------------------------------------------------------------------------------------
2023-10-16 18:49:26,376 epoch 7 - iter 29/292 - loss 0.03302502 - time (sec): 1.87 - samples/sec: 3020.84 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:49:28,000 epoch 7 - iter 58/292 - loss 0.02701595 - time (sec): 3.49 - samples/sec: 2914.66 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:49:29,505 epoch 7 - iter 87/292 - loss 0.02743433 - time (sec): 5.00 - samples/sec: 2830.89 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:49:31,177 epoch 7 - iter 116/292 - loss 0.03183254 - time (sec): 6.67 - samples/sec: 2749.62 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:49:32,784 epoch 7 - iter 145/292 - loss 0.03388558 - time (sec): 8.28 - samples/sec: 2672.20 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:49:34,471 epoch 7 - iter 174/292 - loss 0.03382158 - time (sec): 9.97 - samples/sec: 2690.49 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:49:36,122 epoch 7 - iter 203/292 - loss 0.03367802 - time (sec): 11.62 - samples/sec: 2695.76 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:49:37,702 epoch 7 - iter 232/292 - loss 0.03454636 - time (sec): 13.20 - samples/sec: 2680.29 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:49:39,172 epoch 7 - iter 261/292 - loss 0.03328201 - time (sec): 14.67 - samples/sec: 2692.19 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:49:40,957 epoch 7 - iter 290/292 - loss 0.03164000 - time (sec): 16.45 - samples/sec: 2691.02 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:49:41,053 ----------------------------------------------------------------------------------------------------
2023-10-16 18:49:41,053 EPOCH 7 done: loss 0.0315 - lr: 0.000010
2023-10-16 18:49:42,323 DEV : loss 0.15577590465545654 - f1-score (micro avg) 0.7425
2023-10-16 18:49:42,327 ----------------------------------------------------------------------------------------------------
2023-10-16 18:49:43,926 epoch 8 - iter 29/292 - loss 0.02151028 - time (sec): 1.60 - samples/sec: 2896.40 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:49:45,528 epoch 8 - iter 58/292 - loss 0.01568049 - time (sec): 3.20 - samples/sec: 2836.41 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:49:47,028 epoch 8 - iter 87/292 - loss 0.01534683 - time (sec): 4.70 - samples/sec: 2681.00 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:49:48,699 epoch 8 - iter 116/292 - loss 0.01579191 - time (sec): 6.37 - samples/sec: 2714.41 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:49:50,308 epoch 8 - iter 145/292 - loss 0.01777071 - time (sec): 7.98 - samples/sec: 2713.12 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:49:52,025 epoch 8 - iter 174/292 - loss 0.02052391 - time (sec): 9.70 - samples/sec: 2733.60 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:49:53,814 epoch 8 - iter 203/292 - loss 0.02071842 - time (sec): 11.49 - samples/sec: 2641.29 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:49:55,480 epoch 8 - iter 232/292 - loss 0.02537783 - time (sec): 13.15 - samples/sec: 2639.35 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:49:57,248 epoch 8 - iter 261/292 - loss 0.02774566 - time (sec): 14.92 - samples/sec: 2665.19 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:49:58,766 epoch 8 - iter 290/292 - loss 0.02654013 - time (sec): 16.44 - samples/sec: 2690.86 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:49:58,850 ----------------------------------------------------------------------------------------------------
2023-10-16 18:49:58,850 EPOCH 8 done: loss 0.0264 - lr: 0.000007
2023-10-16 18:50:00,128 DEV : loss 0.15800637006759644 - f1-score (micro avg) 0.7521
2023-10-16 18:50:00,132 saving best model
2023-10-16 18:50:00,641 ----------------------------------------------------------------------------------------------------
2023-10-16 18:50:02,387 epoch 9 - iter 29/292 - loss 0.01041658 - time (sec): 1.74 - samples/sec: 2808.85 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:50:04,020 epoch 9 - iter 58/292 - loss 0.01199768 - time (sec): 3.38 - samples/sec: 2761.93 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:50:05,672 epoch 9 - iter 87/292 - loss 0.01928507 - time (sec): 5.03 - samples/sec: 2709.97 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:50:07,278 epoch 9 - iter 116/292 - loss 0.01843598 - time (sec): 6.63 - samples/sec: 2756.90 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:50:09,017 epoch 9 - iter 145/292 - loss 0.01796752 - time (sec): 8.37 - samples/sec: 2783.69 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:50:10,606 epoch 9 - iter 174/292 - loss 0.01908946 - time (sec): 9.96 - samples/sec: 2745.24 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:50:12,136 epoch 9 - iter 203/292 - loss 0.01923461 - time (sec): 11.49 - samples/sec: 2702.79 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:50:13,933 epoch 9 - iter 232/292 - loss 0.01906635 - time (sec): 13.29 - samples/sec: 2695.05 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:50:15,552 epoch 9 - iter 261/292 - loss 0.02406961 - time (sec): 14.91 - samples/sec: 2696.75 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:50:17,130 epoch 9 - iter 290/292 - loss 0.02290073 - time (sec): 16.49 - samples/sec: 2689.74 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:50:17,226 ----------------------------------------------------------------------------------------------------
2023-10-16 18:50:17,226 EPOCH 9 done: loss 0.0234 - lr: 0.000003
2023-10-16 18:50:18,498 DEV : loss 0.1649204045534134 - f1-score (micro avg) 0.7579
2023-10-16 18:50:18,503 saving best model
2023-10-16 18:50:18,966 ----------------------------------------------------------------------------------------------------
2023-10-16 18:50:20,566 epoch 10 - iter 29/292 - loss 0.01048869 - time (sec): 1.60 - samples/sec: 2501.04 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:50:22,232 epoch 10 - iter 58/292 - loss 0.01035586 - time (sec): 3.26 - samples/sec: 2695.63 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:50:23,973 epoch 10 - iter 87/292 - loss 0.01530515 - time (sec): 5.01 - samples/sec: 2703.40 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:50:25,635 epoch 10 - iter 116/292 - loss 0.01435463 - time (sec): 6.67 - samples/sec: 2786.24 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:50:27,430 epoch 10 - iter 145/292 - loss 0.01677783 - time (sec): 8.46 - samples/sec: 2778.38 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:50:28,954 epoch 10 - iter 174/292 - loss 0.01671529 - time (sec): 9.99 - samples/sec: 2752.26 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:50:30,653 epoch 10 - iter 203/292 - loss 0.01614914 - time (sec): 11.69 - samples/sec: 2712.33 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:50:32,184 epoch 10 - iter 232/292 - loss 0.01780397 - time (sec): 13.22 - samples/sec: 2705.59 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:50:33,826 epoch 10 - iter 261/292 - loss 0.02122159 - time (sec): 14.86 - samples/sec: 2701.08 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:50:35,339 epoch 10 - iter 290/292 - loss 0.01997268 - time (sec): 16.37 - samples/sec: 2697.63 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:50:35,433 ----------------------------------------------------------------------------------------------------
2023-10-16 18:50:35,433 EPOCH 10 done: loss 0.0199 - lr: 0.000000
2023-10-16 18:50:36,707 DEV : loss 0.16057094931602478 - f1-score (micro avg) 0.74
2023-10-16 18:50:37,097 ----------------------------------------------------------------------------------------------------
2023-10-16 18:50:37,098 Loading model from best epoch ...
2023-10-16 18:50:38,880 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
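The 17 tags above form a BIOES scheme over four entity types (LOC, PER, ORG, HumanProd): S- marks a single-token entity, B-/I-/E- the beginning, inside, and end of a multi-token one, and O a non-entity token. A minimal decoder for such a tag sequence could look like the sketch below (an illustration of the scheme, not Flair's internal span decoding):

```python
def bioes_to_spans(tags):
    """Convert a BIOES tag sequence into (start, end, type) spans, end exclusive."""
    spans, start, etype = [], None, None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":                                  # single-token entity
            spans.append((i, i + 1, label))
            start, etype = None, None
        elif prefix == "B":                                # entity begins
            start, etype = i, label
        elif prefix == "E" and start is not None and label == etype:
            spans.append((start, i + 1, etype))            # entity ends
            start, etype = None, None
        elif prefix != "I":                                # "O" or malformed: reset
            start, etype = None, None
    return spans

# Hypothetical tagged sentence with one PER span and one LOC span:
print(bioes_to_spans(["B-PER", "E-PER", "O", "S-LOC"]))
# → [(0, 2, 'PER'), (3, 4, 'LOC')]
```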
2023-10-16 18:50:41,349
Results:
- F-score (micro) 0.7441
- F-score (macro) 0.6612
- Accuracy 0.6157
By class:
              precision    recall  f1-score   support

         PER     0.7775    0.8534    0.8137       348
         LOC     0.6564    0.8199    0.7291       261
         ORG     0.3393    0.3654    0.3519        52
   HumanProd     0.6923    0.8182    0.7500        22

   micro avg     0.6937    0.8023    0.7441       683
   macro avg     0.6164    0.7142    0.6612       683
weighted avg     0.6951    0.8023    0.7442       683
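The aggregate rows follow directly from the per-class rows: macro avg is the unweighted mean of the class scores, while micro avg pools true-positive, predicted, and gold counts across classes. The sketch below recomputes both from the table; recovering the integer counts by rounding recall × support and TP / precision is an assumption about the underlying tallies.

```python
# (precision, recall, support) per class, copied from the table above
by_class = {
    "PER":       (0.7775, 0.8534, 348),
    "LOC":       (0.6564, 0.8199, 261),
    "ORG":       (0.3393, 0.3654,  52),
    "HumanProd": (0.6923, 0.8182,  22),
}

# Recover integer counts: TP = recall * support, predicted = TP / precision.
tp   = {c: round(r * s) for c, (p, r, s) in by_class.items()}
pred = {c: round(tp[c] / p) for c, (p, r, s) in by_class.items()}
gold = sum(s for _, _, s in by_class.values())               # 683 gold entities

# Micro average: pool counts across classes, then compute P/R/F1 once.
micro_p = sum(tp.values()) / sum(pred.values())
micro_r = sum(tp.values()) / gold
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Macro average: mean of the per-class F1 scores.
f1 = {c: 2 * p * r / (p + r) for c, (p, r, s) in by_class.items()}
macro_f1 = sum(f1.values()) / len(f1)

print(f"micro f1 {micro_f1:.4f}, macro f1 {macro_f1:.4f}")
# → micro f1 0.7441, macro f1 0.6612
```

Both values match the reported aggregates, which confirms the table is internally consistent.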
2023-10-16 18:50:41,350 ----------------------------------------------------------------------------------------------------