2023-10-16 18:23:52,773 ----------------------------------------------------------------------------------------------------
2023-10-16 18:23:52,774 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 18:23:52,774 ----------------------------------------------------------------------------------------------------
2023-10-16 18:23:52,774 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:23:52,774 ----------------------------------------------------------------------------------------------------
2023-10-16 18:23:52,774 Train: 1166 sentences
2023-10-16 18:23:52,774 (train_with_dev=False, train_with_test=False)
2023-10-16 18:23:52,774 ----------------------------------------------------------------------------------------------------
2023-10-16 18:23:52,774 Training Params:
2023-10-16 18:23:52,774 - learning_rate: "5e-05"
2023-10-16 18:23:52,774 - mini_batch_size: "4"
2023-10-16 18:23:52,774 - max_epochs: "10"
2023-10-16 18:23:52,774 - shuffle: "True"
2023-10-16 18:23:52,774 ----------------------------------------------------------------------------------------------------
2023-10-16 18:23:52,774 Plugins:
2023-10-16 18:23:52,775 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:23:52,775 ----------------------------------------------------------------------------------------------------
2023-10-16 18:23:52,775 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:23:52,775 - metric: "('micro avg', 'f1-score')"
2023-10-16 18:23:52,775 ----------------------------------------------------------------------------------------------------
2023-10-16 18:23:52,775 Computation:
2023-10-16 18:23:52,775 - compute on device: cuda:0
2023-10-16 18:23:52,775 - embedding storage: none
2023-10-16 18:23:52,775 ----------------------------------------------------------------------------------------------------
2023-10-16 18:23:52,775 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-16 18:23:52,775 ----------------------------------------------------------------------------------------------------
2023-10-16 18:23:52,775 ----------------------------------------------------------------------------------------------------
2023-10-16 18:23:54,355 epoch 1 - iter 29/292 - loss 2.82861770 - time (sec): 1.58 - samples/sec: 2569.51 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:23:55,895 epoch 1 - iter 58/292 - loss 2.27049059 - time (sec): 3.12 - samples/sec: 2418.01 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:23:57,801 epoch 1 - iter 87/292 - loss 1.50862869 - time (sec): 5.02 - samples/sec: 2581.74 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:23:59,391 epoch 1 - iter 116/292 - loss 1.29469874 - time (sec): 6.62 - samples/sec: 2575.35 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:24:00,959 epoch 1 - iter 145/292 - loss 1.16113362 - time (sec): 8.18 - samples/sec: 2577.50 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:24:02,460 epoch 1 - iter 174/292 - loss 1.04202470 - time (sec): 9.68 - samples/sec: 2554.20 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:24:04,304 epoch 1 - iter 203/292 - loss 0.92592921 - time (sec): 11.53 - samples/sec: 2608.57 - lr: 0.000035 - momentum: 0.000000
2023-10-16 18:24:06,039 epoch 1 - iter 232/292 - loss 0.82865037 - time (sec): 13.26 - samples/sec: 2633.60 - lr: 0.000040 - momentum: 0.000000
2023-10-16 18:24:07,694 epoch 1 - iter 261/292 - loss 0.76127572 - time (sec): 14.92 - samples/sec: 2634.58 - lr: 0.000045 - momentum: 0.000000
2023-10-16 18:24:09,386 epoch 1 - iter 290/292 - loss 0.70436125 - time (sec): 16.61 - samples/sec: 2661.85 - lr: 0.000049 - momentum: 0.000000
2023-10-16 18:24:09,480 ----------------------------------------------------------------------------------------------------
2023-10-16 18:24:09,480 EPOCH 1 done: loss 0.7015 - lr: 0.000049
2023-10-16 18:24:10,491 DEV : loss 0.198698028922081 - f1-score (micro avg) 0.4183
2023-10-16 18:24:10,496 saving best model
2023-10-16 18:24:10,894 ----------------------------------------------------------------------------------------------------
2023-10-16 18:24:12,579 epoch 2 - iter 29/292 - loss 0.21870353 - time (sec): 1.68 - samples/sec: 2890.96 - lr: 0.000049 - momentum: 0.000000
2023-10-16 18:24:14,360 epoch 2 - iter 58/292 - loss 0.19710288 - time (sec): 3.46 - samples/sec: 2792.34 - lr: 0.000049 - momentum: 0.000000
2023-10-16 18:24:15,863 epoch 2 - iter 87/292 - loss 0.20636583 - time (sec): 4.97 - samples/sec: 2748.51 - lr: 0.000048 - momentum: 0.000000
2023-10-16 18:24:17,463 epoch 2 - iter 116/292 - loss 0.19745140 - time (sec): 6.57 - samples/sec: 2680.86 - lr: 0.000048 - momentum: 0.000000
2023-10-16 18:24:19,192 epoch 2 - iter 145/292 - loss 0.19457600 - time (sec): 8.30 - samples/sec: 2670.91 - lr: 0.000047 - momentum: 0.000000
2023-10-16 18:24:20,873 epoch 2 - iter 174/292 - loss 0.19655143 - time (sec): 9.98 - samples/sec: 2701.15 - lr: 0.000047 - momentum: 0.000000
2023-10-16 18:24:22,548 epoch 2 - iter 203/292 - loss 0.19279805 - time (sec): 11.65 - samples/sec: 2695.02 - lr: 0.000046 - momentum: 0.000000
2023-10-16 18:24:24,046 epoch 2 - iter 232/292 - loss 0.18956283 - time (sec): 13.15 - samples/sec: 2679.43 - lr: 0.000046 - momentum: 0.000000
2023-10-16 18:24:25,691 epoch 2 - iter 261/292 - loss 0.18794363 - time (sec): 14.80 - samples/sec: 2702.80 - lr: 0.000045 - momentum: 0.000000
2023-10-16 18:24:27,392 epoch 2 - iter 290/292 - loss 0.18131152 - time (sec): 16.50 - samples/sec: 2688.04 - lr: 0.000045 - momentum: 0.000000
2023-10-16 18:24:27,495 ----------------------------------------------------------------------------------------------------
2023-10-16 18:24:27,495 EPOCH 2 done: loss 0.1809 - lr: 0.000045
2023-10-16 18:24:28,829 DEV : loss 0.12991848587989807 - f1-score (micro avg) 0.6681
2023-10-16 18:24:28,836 saving best model
2023-10-16 18:24:29,353 ----------------------------------------------------------------------------------------------------
2023-10-16 18:24:31,427 epoch 3 - iter 29/292 - loss 0.14634692 - time (sec): 2.07 - samples/sec: 2612.34 - lr: 0.000044 - momentum: 0.000000
2023-10-16 18:24:33,032 epoch 3 - iter 58/292 - loss 0.15575082 - time (sec): 3.68 - samples/sec: 2612.36 - lr: 0.000043 - momentum: 0.000000
2023-10-16 18:24:34,896 epoch 3 - iter 87/292 - loss 0.14045427 - time (sec): 5.54 - samples/sec: 2653.09 - lr: 0.000043 - momentum: 0.000000
2023-10-16 18:24:36,530 epoch 3 - iter 116/292 - loss 0.12675597 - time (sec): 7.17 - samples/sec: 2688.94 - lr: 0.000042 - momentum: 0.000000
2023-10-16 18:24:38,274 epoch 3 - iter 145/292 - loss 0.12307056 - time (sec): 8.92 - samples/sec: 2729.57 - lr: 0.000042 - momentum: 0.000000
2023-10-16 18:24:39,809 epoch 3 - iter 174/292 - loss 0.12034832 - time (sec): 10.45 - samples/sec: 2692.55 - lr: 0.000041 - momentum: 0.000000
2023-10-16 18:24:41,378 epoch 3 - iter 203/292 - loss 0.11494198 - time (sec): 12.02 - samples/sec: 2652.33 - lr: 0.000041 - momentum: 0.000000
2023-10-16 18:24:42,914 epoch 3 - iter 232/292 - loss 0.11259923 - time (sec): 13.56 - samples/sec: 2679.55 - lr: 0.000040 - momentum: 0.000000
2023-10-16 18:24:44,531 epoch 3 - iter 261/292 - loss 0.11038783 - time (sec): 15.18 - samples/sec: 2660.06 - lr: 0.000040 - momentum: 0.000000
2023-10-16 18:24:46,091 epoch 3 - iter 290/292 - loss 0.10544311 - time (sec): 16.74 - samples/sec: 2644.00 - lr: 0.000039 - momentum: 0.000000
2023-10-16 18:24:46,179 ----------------------------------------------------------------------------------------------------
2023-10-16 18:24:46,180 EPOCH 3 done: loss 0.1058 - lr: 0.000039
2023-10-16 18:24:47,660 DEV : loss 0.14769691228866577 - f1-score (micro avg) 0.6436
2023-10-16 18:24:47,665 ----------------------------------------------------------------------------------------------------
2023-10-16 18:24:49,228 epoch 4 - iter 29/292 - loss 0.07509580 - time (sec): 1.56 - samples/sec: 2524.12 - lr: 0.000038 - momentum: 0.000000
2023-10-16 18:24:50,823 epoch 4 - iter 58/292 - loss 0.07886125 - time (sec): 3.16 - samples/sec: 2606.97 - lr: 0.000038 - momentum: 0.000000
2023-10-16 18:24:52,434 epoch 4 - iter 87/292 - loss 0.08386891 - time (sec): 4.77 - samples/sec: 2589.31 - lr: 0.000037 - momentum: 0.000000
2023-10-16 18:24:54,155 epoch 4 - iter 116/292 - loss 0.07402682 - time (sec): 6.49 - samples/sec: 2590.61 - lr: 0.000037 - momentum: 0.000000
2023-10-16 18:24:55,743 epoch 4 - iter 145/292 - loss 0.07089413 - time (sec): 8.08 - samples/sec: 2597.15 - lr: 0.000036 - momentum: 0.000000
2023-10-16 18:24:57,409 epoch 4 - iter 174/292 - loss 0.07525889 - time (sec): 9.74 - samples/sec: 2623.03 - lr: 0.000036 - momentum: 0.000000
2023-10-16 18:24:59,095 epoch 4 - iter 203/292 - loss 0.07936455 - time (sec): 11.43 - samples/sec: 2579.90 - lr: 0.000035 - momentum: 0.000000
2023-10-16 18:25:00,743 epoch 4 - iter 232/292 - loss 0.08057674 - time (sec): 13.08 - samples/sec: 2607.69 - lr: 0.000035 - momentum: 0.000000
2023-10-16 18:25:02,362 epoch 4 - iter 261/292 - loss 0.07932140 - time (sec): 14.70 - samples/sec: 2619.95 - lr: 0.000034 - momentum: 0.000000
2023-10-16 18:25:04,245 epoch 4 - iter 290/292 - loss 0.07381505 - time (sec): 16.58 - samples/sec: 2668.67 - lr: 0.000033 - momentum: 0.000000
2023-10-16 18:25:04,334 ----------------------------------------------------------------------------------------------------
2023-10-16 18:25:04,334 EPOCH 4 done: loss 0.0737 - lr: 0.000033
2023-10-16 18:25:05,677 DEV : loss 0.12834103405475616 - f1-score (micro avg) 0.7359
2023-10-16 18:25:05,683 saving best model
2023-10-16 18:25:06,300 ----------------------------------------------------------------------------------------------------
2023-10-16 18:25:08,081 epoch 5 - iter 29/292 - loss 0.07177539 - time (sec): 1.78 - samples/sec: 2425.89 - lr: 0.000033 - momentum: 0.000000
2023-10-16 18:25:09,758 epoch 5 - iter 58/292 - loss 0.06584511 - time (sec): 3.46 - samples/sec: 2491.49 - lr: 0.000032 - momentum: 0.000000
2023-10-16 18:25:11,555 epoch 5 - iter 87/292 - loss 0.05710451 - time (sec): 5.25 - samples/sec: 2522.65 - lr: 0.000032 - momentum: 0.000000
2023-10-16 18:25:13,217 epoch 5 - iter 116/292 - loss 0.05716308 - time (sec): 6.92 - samples/sec: 2475.43 - lr: 0.000031 - momentum: 0.000000
2023-10-16 18:25:14,897 epoch 5 - iter 145/292 - loss 0.05603785 - time (sec): 8.59 - samples/sec: 2511.36 - lr: 0.000031 - momentum: 0.000000
2023-10-16 18:25:16,552 epoch 5 - iter 174/292 - loss 0.05766367 - time (sec): 10.25 - samples/sec: 2515.05 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:25:18,212 epoch 5 - iter 203/292 - loss 0.05955690 - time (sec): 11.91 - samples/sec: 2543.13 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:25:19,868 epoch 5 - iter 232/292 - loss 0.05935010 - time (sec): 13.57 - samples/sec: 2596.39 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:25:21,531 epoch 5 - iter 261/292 - loss 0.05678007 - time (sec): 15.23 - samples/sec: 2597.33 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:25:23,191 epoch 5 - iter 290/292 - loss 0.05537101 - time (sec): 16.89 - samples/sec: 2620.66 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:25:23,284 ----------------------------------------------------------------------------------------------------
2023-10-16 18:25:23,284 EPOCH 5 done: loss 0.0552 - lr: 0.000028
2023-10-16 18:25:24,543 DEV : loss 0.14645273983478546 - f1-score (micro avg) 0.7352
2023-10-16 18:25:24,548 ----------------------------------------------------------------------------------------------------
2023-10-16 18:25:26,308 epoch 6 - iter 29/292 - loss 0.04452680 - time (sec): 1.76 - samples/sec: 2936.92 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:25:27,791 epoch 6 - iter 58/292 - loss 0.03953523 - time (sec): 3.24 - samples/sec: 2783.92 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:25:29,353 epoch 6 - iter 87/292 - loss 0.03594242 - time (sec): 4.80 - samples/sec: 2699.21 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:25:31,087 epoch 6 - iter 116/292 - loss 0.03355498 - time (sec): 6.54 - samples/sec: 2682.21 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:25:32,641 epoch 6 - iter 145/292 - loss 0.03011679 - time (sec): 8.09 - samples/sec: 2752.84 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:25:34,188 epoch 6 - iter 174/292 - loss 0.03277362 - time (sec): 9.64 - samples/sec: 2715.24 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:25:35,821 epoch 6 - iter 203/292 - loss 0.03272890 - time (sec): 11.27 - samples/sec: 2703.01 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:25:37,583 epoch 6 - iter 232/292 - loss 0.03614478 - time (sec): 13.03 - samples/sec: 2692.34 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:25:39,388 epoch 6 - iter 261/292 - loss 0.04456812 - time (sec): 14.84 - samples/sec: 2713.55 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:25:41,070 epoch 6 - iter 290/292 - loss 0.04463049 - time (sec): 16.52 - samples/sec: 2677.04 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:25:41,157 ----------------------------------------------------------------------------------------------------
2023-10-16 18:25:41,157 EPOCH 6 done: loss 0.0444 - lr: 0.000022
2023-10-16 18:25:42,447 DEV : loss 0.14799444377422333 - f1-score (micro avg) 0.7484
2023-10-16 18:25:42,452 saving best model
2023-10-16 18:25:42,958 ----------------------------------------------------------------------------------------------------
2023-10-16 18:25:44,786 epoch 7 - iter 29/292 - loss 0.03561790 - time (sec): 1.83 - samples/sec: 3036.38 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:25:46,453 epoch 7 - iter 58/292 - loss 0.02551739 - time (sec): 3.49 - samples/sec: 2816.66 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:25:48,083 epoch 7 - iter 87/292 - loss 0.02590902 - time (sec): 5.12 - samples/sec: 2751.37 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:25:49,738 epoch 7 - iter 116/292 - loss 0.02747668 - time (sec): 6.78 - samples/sec: 2682.94 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:25:51,399 epoch 7 - iter 145/292 - loss 0.03466284 - time (sec): 8.44 - samples/sec: 2669.94 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:25:53,181 epoch 7 - iter 174/292 - loss 0.03184043 - time (sec): 10.22 - samples/sec: 2664.73 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:25:54,741 epoch 7 - iter 203/292 - loss 0.02941846 - time (sec): 11.78 - samples/sec: 2660.51 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:25:56,439 epoch 7 - iter 232/292 - loss 0.02738168 - time (sec): 13.48 - samples/sec: 2675.12 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:25:58,056 epoch 7 - iter 261/292 - loss 0.03070679 - time (sec): 15.10 - samples/sec: 2674.75 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:25:59,711 epoch 7 - iter 290/292 - loss 0.03072038 - time (sec): 16.75 - samples/sec: 2647.79 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:25:59,802 ----------------------------------------------------------------------------------------------------
2023-10-16 18:25:59,802 EPOCH 7 done: loss 0.0310 - lr: 0.000017
2023-10-16 18:26:01,110 DEV : loss 0.19859679043293 - f1-score (micro avg) 0.7
2023-10-16 18:26:01,122 ----------------------------------------------------------------------------------------------------
2023-10-16 18:26:02,791 epoch 8 - iter 29/292 - loss 0.02307857 - time (sec): 1.67 - samples/sec: 2596.11 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:26:04,483 epoch 8 - iter 58/292 - loss 0.01807390 - time (sec): 3.36 - samples/sec: 2719.87 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:26:06,451 epoch 8 - iter 87/292 - loss 0.02050120 - time (sec): 5.33 - samples/sec: 2515.58 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:26:08,050 epoch 8 - iter 116/292 - loss 0.01843879 - time (sec): 6.93 - samples/sec: 2494.41 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:26:09,701 epoch 8 - iter 145/292 - loss 0.02220405 - time (sec): 8.58 - samples/sec: 2566.57 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:26:11,313 epoch 8 - iter 174/292 - loss 0.01973760 - time (sec): 10.19 - samples/sec: 2591.01 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:26:13,004 epoch 8 - iter 203/292 - loss 0.02021888 - time (sec): 11.88 - samples/sec: 2585.02 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:26:14,724 epoch 8 - iter 232/292 - loss 0.02196907 - time (sec): 13.60 - samples/sec: 2614.75 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:26:16,202 epoch 8 - iter 261/292 - loss 0.02142726 - time (sec): 15.08 - samples/sec: 2600.76 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:26:17,917 epoch 8 - iter 290/292 - loss 0.02133524 - time (sec): 16.79 - samples/sec: 2629.43 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:26:18,016 ----------------------------------------------------------------------------------------------------
2023-10-16 18:26:18,017 EPOCH 8 done: loss 0.0217 - lr: 0.000011
2023-10-16 18:26:19,265 DEV : loss 0.18336039781570435 - f1-score (micro avg) 0.7265
2023-10-16 18:26:19,269 ----------------------------------------------------------------------------------------------------
2023-10-16 18:26:20,781 epoch 9 - iter 29/292 - loss 0.00973806 - time (sec): 1.51 - samples/sec: 2912.72 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:26:22,671 epoch 9 - iter 58/292 - loss 0.01556620 - time (sec): 3.40 - samples/sec: 2708.61 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:26:24,279 epoch 9 - iter 87/292 - loss 0.01591292 - time (sec): 5.01 - samples/sec: 2680.62 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:26:25,971 epoch 9 - iter 116/292 - loss 0.01447470 - time (sec): 6.70 - samples/sec: 2738.05 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:26:27,665 epoch 9 - iter 145/292 - loss 0.01441480 - time (sec): 8.39 - samples/sec: 2754.42 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:26:29,466 epoch 9 - iter 174/292 - loss 0.01538477 - time (sec): 10.20 - samples/sec: 2754.27 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:26:31,091 epoch 9 - iter 203/292 - loss 0.01607027 - time (sec): 11.82 - samples/sec: 2727.62 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:26:32,659 epoch 9 - iter 232/292 - loss 0.01513598 - time (sec): 13.39 - samples/sec: 2699.68 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:26:34,245 epoch 9 - iter 261/292 - loss 0.01547309 - time (sec): 14.97 - samples/sec: 2677.88 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:26:35,816 epoch 9 - iter 290/292 - loss 0.01443794 - time (sec): 16.55 - samples/sec: 2666.50 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:26:35,924 ----------------------------------------------------------------------------------------------------
2023-10-16 18:26:35,924 EPOCH 9 done: loss 0.0148 - lr: 0.000006
2023-10-16 18:26:37,193 DEV : loss 0.18265648186206818 - f1-score (micro avg) 0.7039
2023-10-16 18:26:37,198 ----------------------------------------------------------------------------------------------------
2023-10-16 18:26:38,851 epoch 10 - iter 29/292 - loss 0.00403559 - time (sec): 1.65 - samples/sec: 2836.55 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:26:40,560 epoch 10 - iter 58/292 - loss 0.00536696 - time (sec): 3.36 - samples/sec: 2893.01 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:26:42,186 epoch 10 - iter 87/292 - loss 0.01069592 - time (sec): 4.99 - samples/sec: 2799.15 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:26:43,781 epoch 10 - iter 116/292 - loss 0.00956979 - time (sec): 6.58 - samples/sec: 2751.92 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:26:45,406 epoch 10 - iter 145/292 - loss 0.01094773 - time (sec): 8.21 - samples/sec: 2733.42 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:26:47,151 epoch 10 - iter 174/292 - loss 0.01223287 - time (sec): 9.95 - samples/sec: 2766.65 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:26:48,801 epoch 10 - iter 203/292 - loss 0.01118601 - time (sec): 11.60 - samples/sec: 2747.13 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:26:50,388 epoch 10 - iter 232/292 - loss 0.01079535 - time (sec): 13.19 - samples/sec: 2709.71 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:26:52,076 epoch 10 - iter 261/292 - loss 0.01043899 - time (sec): 14.88 - samples/sec: 2713.94 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:26:53,572 epoch 10 - iter 290/292 - loss 0.01121750 - time (sec): 16.37 - samples/sec: 2700.50 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:26:53,665 ----------------------------------------------------------------------------------------------------
2023-10-16 18:26:53,665 EPOCH 10 done: loss 0.0112 - lr: 0.000000
2023-10-16 18:26:54,953 DEV : loss 0.1769292950630188 - f1-score (micro avg) 0.7277
2023-10-16 18:26:55,295 ----------------------------------------------------------------------------------------------------
2023-10-16 18:26:55,296 Loading model from best epoch ...
2023-10-16 18:26:56,968 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-16 18:26:59,639
Results:
- F-score (micro) 0.7629
- F-score (macro) 0.6925
- Accuracy 0.639
By class:
              precision    recall  f1-score   support

         PER     0.7989    0.8448    0.8212       348
         LOC     0.6804    0.8238    0.7452       261
         ORG     0.4651    0.3846    0.4211        52
   HumanProd     0.7500    0.8182    0.7826        22

   micro avg     0.7284    0.8009    0.7629       683
   macro avg     0.6736    0.7178    0.6925       683
weighted avg     0.7266    0.8009    0.7605       683
2023-10-16 18:26:59,639 ----------------------------------------------------------------------------------------------------