2023-10-16 18:06:43,262 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:43,263 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-11): 12 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=768, out_features=768, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-16 18:06:43,264 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:43,264 MultiCorpus: 1166 train + 165 dev + 415 test sentences
- NER_HIPE_2022 Corpus: 1166 train + 165 dev + 415 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fi/with_doc_seperator
2023-10-16 18:06:43,264 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:43,264 Train: 1166 sentences
2023-10-16 18:06:43,264 (train_with_dev=False, train_with_test=False)
2023-10-16 18:06:43,264 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:43,264 Training Params:
2023-10-16 18:06:43,264 - learning_rate: "3e-05"
2023-10-16 18:06:43,264 - mini_batch_size: "4"
2023-10-16 18:06:43,264 - max_epochs: "10"
2023-10-16 18:06:43,264 - shuffle: "True"
2023-10-16 18:06:43,264 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:43,264 Plugins:
2023-10-16 18:06:43,264 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 18:06:43,264 ----------------------------------------------------------------------------------------------------
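For reference, the `lr` values in the iteration lines below follow the LinearScheduler plugin's warmup-then-decay shape: warmup_fraction 0.1 over 292 iterations × 10 epochs = 2920 optimizer steps. A minimal sketch that reproduces the logged values; the function name and 1-indexed step convention are illustrative, not Flair's internal API:

```python
def linear_schedule_lr(step, base_lr=3e-5, total_steps=2920, warmup_fraction=0.1):
    """Learning rate at a given 1-indexed optimizer step:
    linear warmup from 0 to base_lr, then linear decay back to 0."""
    warmup_steps = int(total_steps * warmup_fraction)  # 292 here
    if step <= warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

# Matches the log: lr ~3e-06 at iter 29 of epoch 1, peaks at 3e-05
# at the end of epoch 1, and reaches 0 at the final step.
```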
2023-10-16 18:06:43,264 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 18:06:43,264 - metric: "('micro avg', 'f1-score')"
2023-10-16 18:06:43,264 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:43,264 Computation:
2023-10-16 18:06:43,264 - compute on device: cuda:0
2023-10-16 18:06:43,264 - embedding storage: none
2023-10-16 18:06:43,264 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:43,264 Model training base path: "hmbench-newseye/fi-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-16 18:06:43,264 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:43,264 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:45,038 epoch 1 - iter 29/292 - loss 2.88222167 - time (sec): 1.77 - samples/sec: 2934.55 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:06:46,609 epoch 1 - iter 58/292 - loss 2.60446904 - time (sec): 3.34 - samples/sec: 2697.91 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:06:48,242 epoch 1 - iter 87/292 - loss 2.04069918 - time (sec): 4.98 - samples/sec: 2655.47 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:06:49,732 epoch 1 - iter 116/292 - loss 1.70752429 - time (sec): 6.47 - samples/sec: 2663.44 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:06:51,291 epoch 1 - iter 145/292 - loss 1.48725658 - time (sec): 8.03 - samples/sec: 2632.59 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:06:52,955 epoch 1 - iter 174/292 - loss 1.33904630 - time (sec): 9.69 - samples/sec: 2602.10 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:06:54,672 epoch 1 - iter 203/292 - loss 1.16634244 - time (sec): 11.41 - samples/sec: 2674.25 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:06:56,311 epoch 1 - iter 232/292 - loss 1.06978910 - time (sec): 13.05 - samples/sec: 2671.11 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:06:58,151 epoch 1 - iter 261/292 - loss 0.99870433 - time (sec): 14.89 - samples/sec: 2689.92 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:06:59,742 epoch 1 - iter 290/292 - loss 0.93223498 - time (sec): 16.48 - samples/sec: 2676.79 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:06:59,856 ----------------------------------------------------------------------------------------------------
2023-10-16 18:06:59,857 EPOCH 1 done: loss 0.9270 - lr: 0.000030
2023-10-16 18:07:01,225 DEV : loss 0.20976495742797852 - f1-score (micro avg) 0.3889
2023-10-16 18:07:01,232 saving best model
2023-10-16 18:07:01,752 ----------------------------------------------------------------------------------------------------
2023-10-16 18:07:03,453 epoch 2 - iter 29/292 - loss 0.25107587 - time (sec): 1.70 - samples/sec: 2491.07 - lr: 0.000030 - momentum: 0.000000
2023-10-16 18:07:05,086 epoch 2 - iter 58/292 - loss 0.24400947 - time (sec): 3.33 - samples/sec: 2503.73 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:07:06,734 epoch 2 - iter 87/292 - loss 0.23895746 - time (sec): 4.98 - samples/sec: 2492.39 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:07:08,461 epoch 2 - iter 116/292 - loss 0.23745427 - time (sec): 6.71 - samples/sec: 2479.82 - lr: 0.000029 - momentum: 0.000000
2023-10-16 18:07:10,124 epoch 2 - iter 145/292 - loss 0.23323169 - time (sec): 8.37 - samples/sec: 2511.84 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:07:11,997 epoch 2 - iter 174/292 - loss 0.23315863 - time (sec): 10.24 - samples/sec: 2574.83 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:07:13,713 epoch 2 - iter 203/292 - loss 0.22673487 - time (sec): 11.96 - samples/sec: 2610.35 - lr: 0.000028 - momentum: 0.000000
2023-10-16 18:07:15,338 epoch 2 - iter 232/292 - loss 0.22123385 - time (sec): 13.58 - samples/sec: 2631.26 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:07:16,935 epoch 2 - iter 261/292 - loss 0.22344619 - time (sec): 15.18 - samples/sec: 2632.59 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:07:18,600 epoch 2 - iter 290/292 - loss 0.21652095 - time (sec): 16.85 - samples/sec: 2631.92 - lr: 0.000027 - momentum: 0.000000
2023-10-16 18:07:18,687 ----------------------------------------------------------------------------------------------------
2023-10-16 18:07:18,687 EPOCH 2 done: loss 0.2162 - lr: 0.000027
2023-10-16 18:07:19,989 DEV : loss 0.14250923693180084 - f1-score (micro avg) 0.6128
2023-10-16 18:07:19,995 saving best model
2023-10-16 18:07:20,524 ----------------------------------------------------------------------------------------------------
2023-10-16 18:07:22,235 epoch 3 - iter 29/292 - loss 0.14272483 - time (sec): 1.71 - samples/sec: 2548.95 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:07:23,769 epoch 3 - iter 58/292 - loss 0.12941947 - time (sec): 3.24 - samples/sec: 2716.78 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:07:25,434 epoch 3 - iter 87/292 - loss 0.13036619 - time (sec): 4.91 - samples/sec: 2744.37 - lr: 0.000026 - momentum: 0.000000
2023-10-16 18:07:26,970 epoch 3 - iter 116/292 - loss 0.12139105 - time (sec): 6.44 - samples/sec: 2681.79 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:07:28,568 epoch 3 - iter 145/292 - loss 0.11496517 - time (sec): 8.04 - samples/sec: 2685.19 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:07:30,170 epoch 3 - iter 174/292 - loss 0.11820563 - time (sec): 9.64 - samples/sec: 2672.93 - lr: 0.000025 - momentum: 0.000000
2023-10-16 18:07:32,002 epoch 3 - iter 203/292 - loss 0.12100791 - time (sec): 11.48 - samples/sec: 2706.76 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:07:33,666 epoch 3 - iter 232/292 - loss 0.11592913 - time (sec): 13.14 - samples/sec: 2704.67 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:07:35,220 epoch 3 - iter 261/292 - loss 0.11476202 - time (sec): 14.69 - samples/sec: 2706.54 - lr: 0.000024 - momentum: 0.000000
2023-10-16 18:07:37,049 epoch 3 - iter 290/292 - loss 0.11381579 - time (sec): 16.52 - samples/sec: 2679.81 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:07:37,143 ----------------------------------------------------------------------------------------------------
2023-10-16 18:07:37,143 EPOCH 3 done: loss 0.1135 - lr: 0.000023
2023-10-16 18:07:38,429 DEV : loss 0.12919388711452484 - f1-score (micro avg) 0.6814
2023-10-16 18:07:38,436 saving best model
2023-10-16 18:07:38,944 ----------------------------------------------------------------------------------------------------
2023-10-16 18:07:40,795 epoch 4 - iter 29/292 - loss 0.08033603 - time (sec): 1.85 - samples/sec: 2760.24 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:07:42,390 epoch 4 - iter 58/292 - loss 0.09107035 - time (sec): 3.44 - samples/sec: 2756.18 - lr: 0.000023 - momentum: 0.000000
2023-10-16 18:07:43,995 epoch 4 - iter 87/292 - loss 0.08192222 - time (sec): 5.05 - samples/sec: 2743.24 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:07:45,617 epoch 4 - iter 116/292 - loss 0.08082958 - time (sec): 6.67 - samples/sec: 2766.29 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:07:47,371 epoch 4 - iter 145/292 - loss 0.07535777 - time (sec): 8.43 - samples/sec: 2795.29 - lr: 0.000022 - momentum: 0.000000
2023-10-16 18:07:49,099 epoch 4 - iter 174/292 - loss 0.07310859 - time (sec): 10.15 - samples/sec: 2773.92 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:07:50,665 epoch 4 - iter 203/292 - loss 0.07590109 - time (sec): 11.72 - samples/sec: 2763.34 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:07:52,377 epoch 4 - iter 232/292 - loss 0.07416772 - time (sec): 13.43 - samples/sec: 2702.56 - lr: 0.000021 - momentum: 0.000000
2023-10-16 18:07:54,003 epoch 4 - iter 261/292 - loss 0.07169329 - time (sec): 15.06 - samples/sec: 2706.89 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:07:55,724 epoch 4 - iter 290/292 - loss 0.06896376 - time (sec): 16.78 - samples/sec: 2641.64 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:07:55,807 ----------------------------------------------------------------------------------------------------
2023-10-16 18:07:55,807 EPOCH 4 done: loss 0.0688 - lr: 0.000020
2023-10-16 18:07:57,088 DEV : loss 0.12307216227054596 - f1-score (micro avg) 0.7595
2023-10-16 18:07:57,094 saving best model
2023-10-16 18:07:57,646 ----------------------------------------------------------------------------------------------------
2023-10-16 18:07:59,313 epoch 5 - iter 29/292 - loss 0.03955736 - time (sec): 1.67 - samples/sec: 2868.56 - lr: 0.000020 - momentum: 0.000000
2023-10-16 18:08:01,015 epoch 5 - iter 58/292 - loss 0.04332368 - time (sec): 3.37 - samples/sec: 2807.90 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:08:02,822 epoch 5 - iter 87/292 - loss 0.04161002 - time (sec): 5.17 - samples/sec: 2811.24 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:08:04,459 epoch 5 - iter 116/292 - loss 0.03749425 - time (sec): 6.81 - samples/sec: 2772.83 - lr: 0.000019 - momentum: 0.000000
2023-10-16 18:08:05,992 epoch 5 - iter 145/292 - loss 0.03825239 - time (sec): 8.34 - samples/sec: 2738.86 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:08:07,530 epoch 5 - iter 174/292 - loss 0.03808185 - time (sec): 9.88 - samples/sec: 2697.02 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:08:09,124 epoch 5 - iter 203/292 - loss 0.03880310 - time (sec): 11.48 - samples/sec: 2671.49 - lr: 0.000018 - momentum: 0.000000
2023-10-16 18:08:10,845 epoch 5 - iter 232/292 - loss 0.04783043 - time (sec): 13.20 - samples/sec: 2665.68 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:08:12,530 epoch 5 - iter 261/292 - loss 0.04825293 - time (sec): 14.88 - samples/sec: 2628.51 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:08:14,265 epoch 5 - iter 290/292 - loss 0.05008356 - time (sec): 16.62 - samples/sec: 2662.54 - lr: 0.000017 - momentum: 0.000000
2023-10-16 18:08:14,357 ----------------------------------------------------------------------------------------------------
2023-10-16 18:08:14,358 EPOCH 5 done: loss 0.0499 - lr: 0.000017
2023-10-16 18:08:15,621 DEV : loss 0.1350400596857071 - f1-score (micro avg) 0.766
2023-10-16 18:08:15,626 saving best model
2023-10-16 18:08:16,115 ----------------------------------------------------------------------------------------------------
2023-10-16 18:08:17,779 epoch 6 - iter 29/292 - loss 0.04105101 - time (sec): 1.66 - samples/sec: 2305.85 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:08:19,486 epoch 6 - iter 58/292 - loss 0.04502925 - time (sec): 3.37 - samples/sec: 2453.67 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:08:21,059 epoch 6 - iter 87/292 - loss 0.03516229 - time (sec): 4.94 - samples/sec: 2487.33 - lr: 0.000016 - momentum: 0.000000
2023-10-16 18:08:22,563 epoch 6 - iter 116/292 - loss 0.03210820 - time (sec): 6.45 - samples/sec: 2555.05 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:08:24,329 epoch 6 - iter 145/292 - loss 0.02906604 - time (sec): 8.21 - samples/sec: 2574.10 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:08:25,816 epoch 6 - iter 174/292 - loss 0.02948663 - time (sec): 9.70 - samples/sec: 2553.14 - lr: 0.000015 - momentum: 0.000000
2023-10-16 18:08:27,546 epoch 6 - iter 203/292 - loss 0.03338297 - time (sec): 11.43 - samples/sec: 2570.13 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:08:29,334 epoch 6 - iter 232/292 - loss 0.03431716 - time (sec): 13.22 - samples/sec: 2607.40 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:08:31,046 epoch 6 - iter 261/292 - loss 0.03709105 - time (sec): 14.93 - samples/sec: 2646.56 - lr: 0.000014 - momentum: 0.000000
2023-10-16 18:08:32,738 epoch 6 - iter 290/292 - loss 0.03590560 - time (sec): 16.62 - samples/sec: 2660.19 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:08:32,826 ----------------------------------------------------------------------------------------------------
2023-10-16 18:08:32,826 EPOCH 6 done: loss 0.0359 - lr: 0.000013
2023-10-16 18:08:34,041 DEV : loss 0.1274399310350418 - f1-score (micro avg) 0.8009
2023-10-16 18:08:34,045 saving best model
2023-10-16 18:08:34,569 ----------------------------------------------------------------------------------------------------
2023-10-16 18:08:36,179 epoch 7 - iter 29/292 - loss 0.03485645 - time (sec): 1.61 - samples/sec: 2550.22 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:08:37,955 epoch 7 - iter 58/292 - loss 0.03354645 - time (sec): 3.38 - samples/sec: 2752.99 - lr: 0.000013 - momentum: 0.000000
2023-10-16 18:08:39,678 epoch 7 - iter 87/292 - loss 0.02845269 - time (sec): 5.11 - samples/sec: 2774.75 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:08:41,459 epoch 7 - iter 116/292 - loss 0.03214827 - time (sec): 6.89 - samples/sec: 2730.26 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:08:43,094 epoch 7 - iter 145/292 - loss 0.03051476 - time (sec): 8.52 - samples/sec: 2687.09 - lr: 0.000012 - momentum: 0.000000
2023-10-16 18:08:44,601 epoch 7 - iter 174/292 - loss 0.02794386 - time (sec): 10.03 - samples/sec: 2685.33 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:08:46,149 epoch 7 - iter 203/292 - loss 0.02570380 - time (sec): 11.58 - samples/sec: 2670.47 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:08:47,783 epoch 7 - iter 232/292 - loss 0.02741901 - time (sec): 13.21 - samples/sec: 2705.87 - lr: 0.000011 - momentum: 0.000000
2023-10-16 18:08:49,317 epoch 7 - iter 261/292 - loss 0.02628470 - time (sec): 14.75 - samples/sec: 2692.65 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:08:51,015 epoch 7 - iter 290/292 - loss 0.02776424 - time (sec): 16.44 - samples/sec: 2685.07 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:08:51,122 ----------------------------------------------------------------------------------------------------
2023-10-16 18:08:51,123 EPOCH 7 done: loss 0.0276 - lr: 0.000010
2023-10-16 18:08:52,409 DEV : loss 0.14393527805805206 - f1-score (micro avg) 0.7603
2023-10-16 18:08:52,413 ----------------------------------------------------------------------------------------------------
2023-10-16 18:08:54,121 epoch 8 - iter 29/292 - loss 0.02543747 - time (sec): 1.71 - samples/sec: 2790.30 - lr: 0.000010 - momentum: 0.000000
2023-10-16 18:08:55,512 epoch 8 - iter 58/292 - loss 0.02833714 - time (sec): 3.10 - samples/sec: 2578.54 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:08:57,277 epoch 8 - iter 87/292 - loss 0.02513038 - time (sec): 4.86 - samples/sec: 2585.81 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:08:59,029 epoch 8 - iter 116/292 - loss 0.02477383 - time (sec): 6.61 - samples/sec: 2622.43 - lr: 0.000009 - momentum: 0.000000
2023-10-16 18:09:00,644 epoch 8 - iter 145/292 - loss 0.02605667 - time (sec): 8.23 - samples/sec: 2653.35 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:09:02,344 epoch 8 - iter 174/292 - loss 0.02378070 - time (sec): 9.93 - samples/sec: 2690.71 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:09:04,137 epoch 8 - iter 203/292 - loss 0.02209407 - time (sec): 11.72 - samples/sec: 2726.41 - lr: 0.000008 - momentum: 0.000000
2023-10-16 18:09:05,761 epoch 8 - iter 232/292 - loss 0.02304147 - time (sec): 13.35 - samples/sec: 2733.41 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:09:07,268 epoch 8 - iter 261/292 - loss 0.02152923 - time (sec): 14.85 - samples/sec: 2696.49 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:09:08,856 epoch 8 - iter 290/292 - loss 0.02045988 - time (sec): 16.44 - samples/sec: 2689.85 - lr: 0.000007 - momentum: 0.000000
2023-10-16 18:09:08,958 ----------------------------------------------------------------------------------------------------
2023-10-16 18:09:08,958 EPOCH 8 done: loss 0.0204 - lr: 0.000007
2023-10-16 18:09:10,489 DEV : loss 0.153466135263443 - f1-score (micro avg) 0.7689
2023-10-16 18:09:10,493 ----------------------------------------------------------------------------------------------------
2023-10-16 18:09:12,364 epoch 9 - iter 29/292 - loss 0.00758189 - time (sec): 1.87 - samples/sec: 2919.25 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:09:14,012 epoch 9 - iter 58/292 - loss 0.01509998 - time (sec): 3.52 - samples/sec: 2654.32 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:09:15,852 epoch 9 - iter 87/292 - loss 0.02017623 - time (sec): 5.36 - samples/sec: 2611.26 - lr: 0.000006 - momentum: 0.000000
2023-10-16 18:09:17,652 epoch 9 - iter 116/292 - loss 0.02170409 - time (sec): 7.16 - samples/sec: 2590.86 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:09:19,465 epoch 9 - iter 145/292 - loss 0.01832989 - time (sec): 8.97 - samples/sec: 2530.51 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:09:21,042 epoch 9 - iter 174/292 - loss 0.01817301 - time (sec): 10.55 - samples/sec: 2525.19 - lr: 0.000005 - momentum: 0.000000
2023-10-16 18:09:22,702 epoch 9 - iter 203/292 - loss 0.01751352 - time (sec): 12.21 - samples/sec: 2524.94 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:09:24,524 epoch 9 - iter 232/292 - loss 0.01770380 - time (sec): 14.03 - samples/sec: 2524.00 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:09:26,227 epoch 9 - iter 261/292 - loss 0.01613702 - time (sec): 15.73 - samples/sec: 2538.20 - lr: 0.000004 - momentum: 0.000000
2023-10-16 18:09:27,928 epoch 9 - iter 290/292 - loss 0.01569710 - time (sec): 17.43 - samples/sec: 2542.91 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:09:28,012 ----------------------------------------------------------------------------------------------------
2023-10-16 18:09:28,012 EPOCH 9 done: loss 0.0157 - lr: 0.000003
2023-10-16 18:09:29,289 DEV : loss 0.1515672355890274 - f1-score (micro avg) 0.756
2023-10-16 18:09:29,294 ----------------------------------------------------------------------------------------------------
2023-10-16 18:09:30,951 epoch 10 - iter 29/292 - loss 0.00834902 - time (sec): 1.66 - samples/sec: 2534.54 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:09:32,573 epoch 10 - iter 58/292 - loss 0.00778017 - time (sec): 3.28 - samples/sec: 2506.92 - lr: 0.000003 - momentum: 0.000000
2023-10-16 18:09:34,325 epoch 10 - iter 87/292 - loss 0.00669187 - time (sec): 5.03 - samples/sec: 2503.73 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:09:36,116 epoch 10 - iter 116/292 - loss 0.00906403 - time (sec): 6.82 - samples/sec: 2544.52 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:09:37,714 epoch 10 - iter 145/292 - loss 0.00978993 - time (sec): 8.42 - samples/sec: 2531.12 - lr: 0.000002 - momentum: 0.000000
2023-10-16 18:09:39,369 epoch 10 - iter 174/292 - loss 0.01122421 - time (sec): 10.07 - samples/sec: 2571.88 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:09:40,888 epoch 10 - iter 203/292 - loss 0.01027289 - time (sec): 11.59 - samples/sec: 2600.77 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:09:42,713 epoch 10 - iter 232/292 - loss 0.00980163 - time (sec): 13.42 - samples/sec: 2584.77 - lr: 0.000001 - momentum: 0.000000
2023-10-16 18:09:44,267 epoch 10 - iter 261/292 - loss 0.00981947 - time (sec): 14.97 - samples/sec: 2603.16 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:09:46,020 epoch 10 - iter 290/292 - loss 0.01181952 - time (sec): 16.73 - samples/sec: 2643.22 - lr: 0.000000 - momentum: 0.000000
2023-10-16 18:09:46,120 ----------------------------------------------------------------------------------------------------
2023-10-16 18:09:46,120 EPOCH 10 done: loss 0.0124 - lr: 0.000000
2023-10-16 18:09:47,424 DEV : loss 0.15941958129405975 - f1-score (micro avg) 0.7452
2023-10-16 18:09:47,808 ----------------------------------------------------------------------------------------------------
2023-10-16 18:09:47,810 Loading model from best epoch ...
2023-10-16 18:09:49,528 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
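The 17-tag dictionary above uses the BIOES scheme (Single, Begin, Inside, End, plus O). A minimal decoder sketch turning such a tag sequence into labeled spans; this is an illustrative helper, not part of Flair's API:

```python
def bioes_to_spans(tags):
    """Convert a BIOES tag sequence into (start, end, label) spans, inclusive indices."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "O":
            start = None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "S":          # single-token entity
            spans.append((i, i, label))
            start = None
        elif prefix == "B":        # entity begins
            start = i
        elif prefix == "E" and start is not None:  # entity ends
            spans.append((start, i, label))
            start = None
        # "I" tokens simply continue an open span
    return spans
```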
2023-10-16 18:09:52,319
Results:
- F-score (micro) 0.7547
- F-score (macro) 0.6975
- Accuracy 0.6325
By class:
              precision    recall  f1-score   support

         PER     0.8187    0.8305    0.8245       348
         LOC     0.6433    0.8084    0.7165       261
         ORG     0.5128    0.3846    0.4396        52
   HumanProd     0.8500    0.7727    0.8095        22

   micro avg     0.7257    0.7862    0.7547       683
   macro avg     0.7062    0.6991    0.6975       683
weighted avg     0.7294    0.7862    0.7534       683
2023-10-16 18:09:52,319 ----------------------------------------------------------------------------------------------------
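As a sanity check on the table above, the micro F-score is the harmonic mean of the micro-averaged precision and recall, and the macro F-score is the unweighted mean of the per-class f1 values:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

micro = round(f1(0.7257, 0.7862), 4)                       # 0.7547, as reported
macro = round((0.8245 + 0.7165 + 0.4396 + 0.8095) / 4, 4)  # 0.6975, as reported
```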