2023-10-25 14:50:02,212 ----------------------------------------------------------------------------------------------------
2023-10-25 14:50:02,212 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(64001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-25 14:50:02,213 ----------------------------------------------------------------------------------------------------
2023-10-25 14:50:02,213 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
- NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-25 14:50:02,213 ----------------------------------------------------------------------------------------------------
2023-10-25 14:50:02,213 Train: 7142 sentences
2023-10-25 14:50:02,213 (train_with_dev=False, train_with_test=False)
2023-10-25 14:50:02,213 ----------------------------------------------------------------------------------------------------
2023-10-25 14:50:02,213 Training Params:
2023-10-25 14:50:02,213 - learning_rate: "5e-05"
2023-10-25 14:50:02,213 - mini_batch_size: "4"
2023-10-25 14:50:02,213 - max_epochs: "10"
2023-10-25 14:50:02,213 - shuffle: "True"
2023-10-25 14:50:02,213 ----------------------------------------------------------------------------------------------------
2023-10-25 14:50:02,213 Plugins:
2023-10-25 14:50:02,213 - TensorboardLogger
2023-10-25 14:50:02,213 - LinearScheduler | warmup_fraction: '0.1'
2023-10-25 14:50:02,213 ----------------------------------------------------------------------------------------------------
2023-10-25 14:50:02,213 Final evaluation on model from best epoch (best-model.pt)
2023-10-25 14:50:02,214 - metric: "('micro avg', 'f1-score')"
2023-10-25 14:50:02,214 ----------------------------------------------------------------------------------------------------
2023-10-25 14:50:02,214 Computation:
2023-10-25 14:50:02,214 - compute on device: cuda:0
2023-10-25 14:50:02,214 - embedding storage: none
2023-10-25 14:50:02,214 ----------------------------------------------------------------------------------------------------
2023-10-25 14:50:02,214 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-64k-td-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-25 14:50:02,214 ----------------------------------------------------------------------------------------------------
2023-10-25 14:50:02,214 ----------------------------------------------------------------------------------------------------
2023-10-25 14:50:02,214 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-25 14:50:11,883 epoch 1 - iter 178/1786 - loss 1.84783869 - time (sec): 9.67 - samples/sec: 2583.61 - lr: 0.000005 - momentum: 0.000000
2023-10-25 14:50:21,608 epoch 1 - iter 356/1786 - loss 1.11625612 - time (sec): 19.39 - samples/sec: 2603.59 - lr: 0.000010 - momentum: 0.000000
2023-10-25 14:50:31,390 epoch 1 - iter 534/1786 - loss 0.83906654 - time (sec): 29.18 - samples/sec: 2568.92 - lr: 0.000015 - momentum: 0.000000
2023-10-25 14:50:40,838 epoch 1 - iter 712/1786 - loss 0.68032105 - time (sec): 38.62 - samples/sec: 2597.39 - lr: 0.000020 - momentum: 0.000000
2023-10-25 14:50:50,185 epoch 1 - iter 890/1786 - loss 0.58377808 - time (sec): 47.97 - samples/sec: 2608.11 - lr: 0.000025 - momentum: 0.000000
2023-10-25 14:50:59,347 epoch 1 - iter 1068/1786 - loss 0.51796911 - time (sec): 57.13 - samples/sec: 2609.43 - lr: 0.000030 - momentum: 0.000000
2023-10-25 14:51:08,975 epoch 1 - iter 1246/1786 - loss 0.46226937 - time (sec): 66.76 - samples/sec: 2625.70 - lr: 0.000035 - momentum: 0.000000
2023-10-25 14:51:18,683 epoch 1 - iter 1424/1786 - loss 0.42498520 - time (sec): 76.47 - samples/sec: 2601.98 - lr: 0.000040 - momentum: 0.000000
2023-10-25 14:51:27,888 epoch 1 - iter 1602/1786 - loss 0.39393828 - time (sec): 85.67 - samples/sec: 2609.03 - lr: 0.000045 - momentum: 0.000000
2023-10-25 14:51:37,261 epoch 1 - iter 1780/1786 - loss 0.37225174 - time (sec): 95.05 - samples/sec: 2607.72 - lr: 0.000050 - momentum: 0.000000
2023-10-25 14:51:37,609 ----------------------------------------------------------------------------------------------------
2023-10-25 14:51:37,610 EPOCH 1 done: loss 0.3712 - lr: 0.000050
2023-10-25 14:51:41,345 DEV : loss 0.12383320182561874 - f1-score (micro avg) 0.692
2023-10-25 14:51:41,367 saving best model
2023-10-25 14:51:41,834 ----------------------------------------------------------------------------------------------------
2023-10-25 14:51:50,898 epoch 2 - iter 178/1786 - loss 0.11440561 - time (sec): 9.06 - samples/sec: 2722.95 - lr: 0.000049 - momentum: 0.000000
2023-10-25 14:51:59,829 epoch 2 - iter 356/1786 - loss 0.12055337 - time (sec): 17.99 - samples/sec: 2748.37 - lr: 0.000049 - momentum: 0.000000
2023-10-25 14:52:09,589 epoch 2 - iter 534/1786 - loss 0.12495791 - time (sec): 27.75 - samples/sec: 2647.01 - lr: 0.000048 - momentum: 0.000000
2023-10-25 14:52:19,211 epoch 2 - iter 712/1786 - loss 0.12559829 - time (sec): 37.38 - samples/sec: 2624.13 - lr: 0.000048 - momentum: 0.000000
2023-10-25 14:52:28,850 epoch 2 - iter 890/1786 - loss 0.11968865 - time (sec): 47.01 - samples/sec: 2603.47 - lr: 0.000047 - momentum: 0.000000
2023-10-25 14:52:38,347 epoch 2 - iter 1068/1786 - loss 0.12384761 - time (sec): 56.51 - samples/sec: 2593.27 - lr: 0.000047 - momentum: 0.000000
2023-10-25 14:52:47,363 epoch 2 - iter 1246/1786 - loss 0.12564491 - time (sec): 65.53 - samples/sec: 2626.45 - lr: 0.000046 - momentum: 0.000000
2023-10-25 14:52:56,330 epoch 2 - iter 1424/1786 - loss 0.12415748 - time (sec): 74.50 - samples/sec: 2659.00 - lr: 0.000046 - momentum: 0.000000
2023-10-25 14:53:05,248 epoch 2 - iter 1602/1786 - loss 0.12176852 - time (sec): 83.41 - samples/sec: 2685.88 - lr: 0.000045 - momentum: 0.000000
2023-10-25 14:53:14,581 epoch 2 - iter 1780/1786 - loss 0.12143837 - time (sec): 92.75 - samples/sec: 2672.87 - lr: 0.000044 - momentum: 0.000000
2023-10-25 14:53:14,882 ----------------------------------------------------------------------------------------------------
2023-10-25 14:53:14,883 EPOCH 2 done: loss 0.1213 - lr: 0.000044
2023-10-25 14:53:20,002 DEV : loss 0.15896384418010712 - f1-score (micro avg) 0.7482
2023-10-25 14:53:20,025 saving best model
2023-10-25 14:53:20,700 ----------------------------------------------------------------------------------------------------
2023-10-25 14:53:29,708 epoch 3 - iter 178/1786 - loss 0.07747975 - time (sec): 9.01 - samples/sec: 2728.60 - lr: 0.000044 - momentum: 0.000000
2023-10-25 14:53:39,305 epoch 3 - iter 356/1786 - loss 0.08659897 - time (sec): 18.60 - samples/sec: 2753.37 - lr: 0.000043 - momentum: 0.000000
2023-10-25 14:53:48,734 epoch 3 - iter 534/1786 - loss 0.08333401 - time (sec): 28.03 - samples/sec: 2690.93 - lr: 0.000043 - momentum: 0.000000
2023-10-25 14:53:58,266 epoch 3 - iter 712/1786 - loss 0.08421054 - time (sec): 37.56 - samples/sec: 2649.55 - lr: 0.000042 - momentum: 0.000000
2023-10-25 14:54:07,433 epoch 3 - iter 890/1786 - loss 0.08466478 - time (sec): 46.73 - samples/sec: 2634.36 - lr: 0.000042 - momentum: 0.000000
2023-10-25 14:54:16,555 epoch 3 - iter 1068/1786 - loss 0.08490863 - time (sec): 55.85 - samples/sec: 2649.10 - lr: 0.000041 - momentum: 0.000000
2023-10-25 14:54:25,902 epoch 3 - iter 1246/1786 - loss 0.08422222 - time (sec): 65.20 - samples/sec: 2663.92 - lr: 0.000041 - momentum: 0.000000
2023-10-25 14:54:34,984 epoch 3 - iter 1424/1786 - loss 0.08498648 - time (sec): 74.28 - samples/sec: 2675.02 - lr: 0.000040 - momentum: 0.000000
2023-10-25 14:54:44,185 epoch 3 - iter 1602/1786 - loss 0.08479702 - time (sec): 83.48 - samples/sec: 2669.15 - lr: 0.000039 - momentum: 0.000000
2023-10-25 14:54:53,654 epoch 3 - iter 1780/1786 - loss 0.08642108 - time (sec): 92.95 - samples/sec: 2669.53 - lr: 0.000039 - momentum: 0.000000
2023-10-25 14:54:53,978 ----------------------------------------------------------------------------------------------------
2023-10-25 14:54:53,978 EPOCH 3 done: loss 0.0865 - lr: 0.000039
2023-10-25 14:54:57,810 DEV : loss 0.13499563932418823 - f1-score (micro avg) 0.7639
2023-10-25 14:54:57,834 saving best model
2023-10-25 14:54:58,513 ----------------------------------------------------------------------------------------------------
2023-10-25 14:55:08,213 epoch 4 - iter 178/1786 - loss 0.07787612 - time (sec): 9.70 - samples/sec: 2662.18 - lr: 0.000038 - momentum: 0.000000
2023-10-25 14:55:17,762 epoch 4 - iter 356/1786 - loss 0.07006565 - time (sec): 19.25 - samples/sec: 2597.01 - lr: 0.000038 - momentum: 0.000000
2023-10-25 14:55:27,282 epoch 4 - iter 534/1786 - loss 0.06777304 - time (sec): 28.77 - samples/sec: 2561.15 - lr: 0.000037 - momentum: 0.000000
2023-10-25 14:55:36,832 epoch 4 - iter 712/1786 - loss 0.06334353 - time (sec): 38.32 - samples/sec: 2605.60 - lr: 0.000037 - momentum: 0.000000
2023-10-25 14:55:46,207 epoch 4 - iter 890/1786 - loss 0.06251813 - time (sec): 47.69 - samples/sec: 2628.71 - lr: 0.000036 - momentum: 0.000000
2023-10-25 14:55:55,932 epoch 4 - iter 1068/1786 - loss 0.06251651 - time (sec): 57.42 - samples/sec: 2599.12 - lr: 0.000036 - momentum: 0.000000
2023-10-25 14:56:05,686 epoch 4 - iter 1246/1786 - loss 0.06255099 - time (sec): 67.17 - samples/sec: 2585.14 - lr: 0.000035 - momentum: 0.000000
2023-10-25 14:56:15,297 epoch 4 - iter 1424/1786 - loss 0.06202994 - time (sec): 76.78 - samples/sec: 2582.61 - lr: 0.000034 - momentum: 0.000000
2023-10-25 14:56:25,232 epoch 4 - iter 1602/1786 - loss 0.06194081 - time (sec): 86.72 - samples/sec: 2583.68 - lr: 0.000034 - momentum: 0.000000
2023-10-25 14:56:34,873 epoch 4 - iter 1780/1786 - loss 0.06270639 - time (sec): 96.36 - samples/sec: 2573.68 - lr: 0.000033 - momentum: 0.000000
2023-10-25 14:56:35,198 ----------------------------------------------------------------------------------------------------
2023-10-25 14:56:35,198 EPOCH 4 done: loss 0.0628 - lr: 0.000033
2023-10-25 14:56:39,818 DEV : loss 0.18497972190380096 - f1-score (micro avg) 0.7612
2023-10-25 14:56:39,839 ----------------------------------------------------------------------------------------------------
2023-10-25 14:56:49,615 epoch 5 - iter 178/1786 - loss 0.05664153 - time (sec): 9.77 - samples/sec: 2702.46 - lr: 0.000033 - momentum: 0.000000
2023-10-25 14:56:59,207 epoch 5 - iter 356/1786 - loss 0.05522861 - time (sec): 19.37 - samples/sec: 2660.03 - lr: 0.000032 - momentum: 0.000000
2023-10-25 14:57:09,051 epoch 5 - iter 534/1786 - loss 0.05116909 - time (sec): 29.21 - samples/sec: 2614.02 - lr: 0.000032 - momentum: 0.000000
2023-10-25 14:57:18,747 epoch 5 - iter 712/1786 - loss 0.05214688 - time (sec): 38.91 - samples/sec: 2573.04 - lr: 0.000031 - momentum: 0.000000
2023-10-25 14:57:28,444 epoch 5 - iter 890/1786 - loss 0.04958778 - time (sec): 48.60 - samples/sec: 2534.05 - lr: 0.000031 - momentum: 0.000000
2023-10-25 14:57:38,250 epoch 5 - iter 1068/1786 - loss 0.04777163 - time (sec): 58.41 - samples/sec: 2521.25 - lr: 0.000030 - momentum: 0.000000
2023-10-25 14:57:48,040 epoch 5 - iter 1246/1786 - loss 0.04818385 - time (sec): 68.20 - samples/sec: 2496.31 - lr: 0.000029 - momentum: 0.000000
2023-10-25 14:57:57,629 epoch 5 - iter 1424/1786 - loss 0.04857465 - time (sec): 77.79 - samples/sec: 2538.86 - lr: 0.000029 - momentum: 0.000000
2023-10-25 14:58:07,122 epoch 5 - iter 1602/1786 - loss 0.04773650 - time (sec): 87.28 - samples/sec: 2557.13 - lr: 0.000028 - momentum: 0.000000
2023-10-25 14:58:16,910 epoch 5 - iter 1780/1786 - loss 0.04709670 - time (sec): 97.07 - samples/sec: 2554.87 - lr: 0.000028 - momentum: 0.000000
2023-10-25 14:58:17,232 ----------------------------------------------------------------------------------------------------
2023-10-25 14:58:17,233 EPOCH 5 done: loss 0.0470 - lr: 0.000028
2023-10-25 14:58:21,821 DEV : loss 0.19018808007240295 - f1-score (micro avg) 0.7738
2023-10-25 14:58:21,845 saving best model
2023-10-25 14:58:24,127 ----------------------------------------------------------------------------------------------------
2023-10-25 14:58:33,822 epoch 6 - iter 178/1786 - loss 0.02826424 - time (sec): 9.69 - samples/sec: 2757.71 - lr: 0.000027 - momentum: 0.000000
2023-10-25 14:58:43,468 epoch 6 - iter 356/1786 - loss 0.03138743 - time (sec): 19.34 - samples/sec: 2673.84 - lr: 0.000027 - momentum: 0.000000
2023-10-25 14:58:52,936 epoch 6 - iter 534/1786 - loss 0.03399004 - time (sec): 28.81 - samples/sec: 2617.71 - lr: 0.000026 - momentum: 0.000000
2023-10-25 14:59:02,446 epoch 6 - iter 712/1786 - loss 0.03560434 - time (sec): 38.32 - samples/sec: 2650.36 - lr: 0.000026 - momentum: 0.000000
2023-10-25 14:59:11,997 epoch 6 - iter 890/1786 - loss 0.03582739 - time (sec): 47.87 - samples/sec: 2636.12 - lr: 0.000025 - momentum: 0.000000
2023-10-25 14:59:21,474 epoch 6 - iter 1068/1786 - loss 0.03714910 - time (sec): 57.35 - samples/sec: 2634.03 - lr: 0.000024 - momentum: 0.000000
2023-10-25 14:59:30,970 epoch 6 - iter 1246/1786 - loss 0.03684768 - time (sec): 66.84 - samples/sec: 2621.31 - lr: 0.000024 - momentum: 0.000000
2023-10-25 14:59:40,363 epoch 6 - iter 1424/1786 - loss 0.03615725 - time (sec): 76.23 - samples/sec: 2618.38 - lr: 0.000023 - momentum: 0.000000
2023-10-25 14:59:49,929 epoch 6 - iter 1602/1786 - loss 0.03647106 - time (sec): 85.80 - samples/sec: 2621.52 - lr: 0.000023 - momentum: 0.000000
2023-10-25 14:59:59,457 epoch 6 - iter 1780/1786 - loss 0.03647415 - time (sec): 95.33 - samples/sec: 2601.49 - lr: 0.000022 - momentum: 0.000000
2023-10-25 14:59:59,775 ----------------------------------------------------------------------------------------------------
2023-10-25 14:59:59,775 EPOCH 6 done: loss 0.0364 - lr: 0.000022
2023-10-25 15:00:03,677 DEV : loss 0.20759737491607666 - f1-score (micro avg) 0.7809
2023-10-25 15:00:03,700 saving best model
2023-10-25 15:00:04,371 ----------------------------------------------------------------------------------------------------
2023-10-25 15:00:13,891 epoch 7 - iter 178/1786 - loss 0.02523091 - time (sec): 9.52 - samples/sec: 2414.94 - lr: 0.000022 - momentum: 0.000000
2023-10-25 15:00:23,459 epoch 7 - iter 356/1786 - loss 0.02954901 - time (sec): 19.08 - samples/sec: 2508.47 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:00:33,286 epoch 7 - iter 534/1786 - loss 0.02745032 - time (sec): 28.91 - samples/sec: 2616.19 - lr: 0.000021 - momentum: 0.000000
2023-10-25 15:00:43,077 epoch 7 - iter 712/1786 - loss 0.02791815 - time (sec): 38.70 - samples/sec: 2635.23 - lr: 0.000020 - momentum: 0.000000
2023-10-25 15:00:52,413 epoch 7 - iter 890/1786 - loss 0.02773715 - time (sec): 48.04 - samples/sec: 2590.73 - lr: 0.000019 - momentum: 0.000000
2023-10-25 15:01:01,227 epoch 7 - iter 1068/1786 - loss 0.02749024 - time (sec): 56.85 - samples/sec: 2611.10 - lr: 0.000019 - momentum: 0.000000
2023-10-25 15:01:09,845 epoch 7 - iter 1246/1786 - loss 0.02786391 - time (sec): 65.47 - samples/sec: 2648.12 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:01:18,505 epoch 7 - iter 1424/1786 - loss 0.02768994 - time (sec): 74.13 - samples/sec: 2674.88 - lr: 0.000018 - momentum: 0.000000
2023-10-25 15:01:27,450 epoch 7 - iter 1602/1786 - loss 0.02880790 - time (sec): 83.08 - samples/sec: 2677.87 - lr: 0.000017 - momentum: 0.000000
2023-10-25 15:01:36,479 epoch 7 - iter 1780/1786 - loss 0.02861625 - time (sec): 92.10 - samples/sec: 2693.30 - lr: 0.000017 - momentum: 0.000000
2023-10-25 15:01:36,780 ----------------------------------------------------------------------------------------------------
2023-10-25 15:01:36,781 EPOCH 7 done: loss 0.0287 - lr: 0.000017
2023-10-25 15:01:41,727 DEV : loss 0.210079625248909 - f1-score (micro avg) 0.7967
2023-10-25 15:01:41,752 saving best model
2023-10-25 15:01:42,460 ----------------------------------------------------------------------------------------------------
2023-10-25 15:01:51,741 epoch 8 - iter 178/1786 - loss 0.01706992 - time (sec): 9.28 - samples/sec: 2667.12 - lr: 0.000016 - momentum: 0.000000
2023-10-25 15:02:00,894 epoch 8 - iter 356/1786 - loss 0.01935111 - time (sec): 18.43 - samples/sec: 2782.76 - lr: 0.000016 - momentum: 0.000000
2023-10-25 15:02:10,126 epoch 8 - iter 534/1786 - loss 0.01874463 - time (sec): 27.66 - samples/sec: 2787.37 - lr: 0.000015 - momentum: 0.000000
2023-10-25 15:02:19,255 epoch 8 - iter 712/1786 - loss 0.01879846 - time (sec): 36.79 - samples/sec: 2798.76 - lr: 0.000014 - momentum: 0.000000
2023-10-25 15:02:28,045 epoch 8 - iter 890/1786 - loss 0.01977016 - time (sec): 45.58 - samples/sec: 2769.63 - lr: 0.000014 - momentum: 0.000000
2023-10-25 15:02:37,385 epoch 8 - iter 1068/1786 - loss 0.02060995 - time (sec): 54.92 - samples/sec: 2734.37 - lr: 0.000013 - momentum: 0.000000
2023-10-25 15:02:46,579 epoch 8 - iter 1246/1786 - loss 0.01992327 - time (sec): 64.12 - samples/sec: 2729.82 - lr: 0.000013 - momentum: 0.000000
2023-10-25 15:02:55,628 epoch 8 - iter 1424/1786 - loss 0.01930754 - time (sec): 73.17 - samples/sec: 2730.50 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:03:04,526 epoch 8 - iter 1602/1786 - loss 0.02003035 - time (sec): 82.06 - samples/sec: 2737.89 - lr: 0.000012 - momentum: 0.000000
2023-10-25 15:03:13,732 epoch 8 - iter 1780/1786 - loss 0.01945620 - time (sec): 91.27 - samples/sec: 2717.14 - lr: 0.000011 - momentum: 0.000000
2023-10-25 15:03:14,036 ----------------------------------------------------------------------------------------------------
2023-10-25 15:03:14,037 EPOCH 8 done: loss 0.0194 - lr: 0.000011
2023-10-25 15:03:17,868 DEV : loss 0.24006861448287964 - f1-score (micro avg) 0.8065
2023-10-25 15:03:17,891 saving best model
2023-10-25 15:03:18,584 ----------------------------------------------------------------------------------------------------
2023-10-25 15:03:28,381 epoch 9 - iter 178/1786 - loss 0.01966686 - time (sec): 9.79 - samples/sec: 2497.75 - lr: 0.000011 - momentum: 0.000000
2023-10-25 15:03:38,805 epoch 9 - iter 356/1786 - loss 0.01914487 - time (sec): 20.22 - samples/sec: 2486.87 - lr: 0.000010 - momentum: 0.000000
2023-10-25 15:03:48,480 epoch 9 - iter 534/1786 - loss 0.01935434 - time (sec): 29.89 - samples/sec: 2548.37 - lr: 0.000009 - momentum: 0.000000
2023-10-25 15:03:58,187 epoch 9 - iter 712/1786 - loss 0.02027442 - time (sec): 39.60 - samples/sec: 2527.51 - lr: 0.000009 - momentum: 0.000000
2023-10-25 15:04:07,497 epoch 9 - iter 890/1786 - loss 0.01881800 - time (sec): 48.91 - samples/sec: 2574.08 - lr: 0.000008 - momentum: 0.000000
2023-10-25 15:04:16,574 epoch 9 - iter 1068/1786 - loss 0.01810846 - time (sec): 57.99 - samples/sec: 2592.00 - lr: 0.000008 - momentum: 0.000000
2023-10-25 15:04:25,884 epoch 9 - iter 1246/1786 - loss 0.01803578 - time (sec): 67.30 - samples/sec: 2607.25 - lr: 0.000007 - momentum: 0.000000
2023-10-25 15:04:36,028 epoch 9 - iter 1424/1786 - loss 0.01811109 - time (sec): 77.44 - samples/sec: 2574.70 - lr: 0.000007 - momentum: 0.000000
2023-10-25 15:04:45,582 epoch 9 - iter 1602/1786 - loss 0.01892874 - time (sec): 87.00 - samples/sec: 2579.09 - lr: 0.000006 - momentum: 0.000000
2023-10-25 15:04:55,055 epoch 9 - iter 1780/1786 - loss 0.02267831 - time (sec): 96.47 - samples/sec: 2571.05 - lr: 0.000006 - momentum: 0.000000
2023-10-25 15:04:55,366 ----------------------------------------------------------------------------------------------------
2023-10-25 15:04:55,367 EPOCH 9 done: loss 0.0230 - lr: 0.000006
2023-10-25 15:04:59,142 DEV : loss 0.25254154205322266 - f1-score (micro avg) 0.6079
2023-10-25 15:04:59,166 ----------------------------------------------------------------------------------------------------
2023-10-25 15:05:08,791 epoch 10 - iter 178/1786 - loss 0.11012665 - time (sec): 9.62 - samples/sec: 2684.65 - lr: 0.000005 - momentum: 0.000000
2023-10-25 15:05:17,986 epoch 10 - iter 356/1786 - loss 0.07275891 - time (sec): 18.82 - samples/sec: 2728.54 - lr: 0.000004 - momentum: 0.000000
2023-10-25 15:05:26,833 epoch 10 - iter 534/1786 - loss 0.06185256 - time (sec): 27.66 - samples/sec: 2718.84 - lr: 0.000004 - momentum: 0.000000
2023-10-25 15:05:35,495 epoch 10 - iter 712/1786 - loss 0.06154437 - time (sec): 36.33 - samples/sec: 2766.27 - lr: 0.000003 - momentum: 0.000000
2023-10-25 15:05:44,207 epoch 10 - iter 890/1786 - loss 0.06553764 - time (sec): 45.04 - samples/sec: 2755.02 - lr: 0.000003 - momentum: 0.000000
2023-10-25 15:05:52,947 epoch 10 - iter 1068/1786 - loss 0.06476228 - time (sec): 53.78 - samples/sec: 2753.92 - lr: 0.000002 - momentum: 0.000000
2023-10-25 15:06:01,545 epoch 10 - iter 1246/1786 - loss 0.06251026 - time (sec): 62.38 - samples/sec: 2780.55 - lr: 0.000002 - momentum: 0.000000
2023-10-25 15:06:10,626 epoch 10 - iter 1424/1786 - loss 0.06067494 - time (sec): 71.46 - samples/sec: 2783.86 - lr: 0.000001 - momentum: 0.000000
2023-10-25 15:06:19,695 epoch 10 - iter 1602/1786 - loss 0.05860883 - time (sec): 80.53 - samples/sec: 2772.55 - lr: 0.000001 - momentum: 0.000000
2023-10-25 15:06:29,012 epoch 10 - iter 1780/1786 - loss 0.05709972 - time (sec): 89.84 - samples/sec: 2758.02 - lr: 0.000000 - momentum: 0.000000
2023-10-25 15:06:29,337 ----------------------------------------------------------------------------------------------------
2023-10-25 15:06:29,338 EPOCH 10 done: loss 0.0570 - lr: 0.000000
2023-10-25 15:06:34,225 DEV : loss 0.2218979001045227 - f1-score (micro avg) 0.6822
2023-10-25 15:06:34,730 ----------------------------------------------------------------------------------------------------
2023-10-25 15:06:34,731 Loading model from best epoch ...
2023-10-25 15:06:36,683 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-25 15:06:48,731
Results:
- F-score (micro) 0.6719
- F-score (macro) 0.5777
- Accuracy 0.5222
By class:
              precision    recall  f1-score   support

         LOC     0.6930    0.6721    0.6824      1095
         PER     0.7336    0.7510    0.7422      1012
         ORG     0.4452    0.5350    0.4860       357
   HumanProd     0.3571    0.4545    0.4000        33

   micro avg     0.6625    0.6816    0.6719      2497
   macro avg     0.5572    0.6032    0.5777      2497
weighted avg     0.6696    0.6816    0.6748      2497
2023-10-25 15:06:48,732 ----------------------------------------------------------------------------------------------------