2023-10-16 20:01:57,367 ----------------------------------------------------------------------------------------------------
2023-10-16 20:01:57,368 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
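The layer shapes printed in the model dump above are enough to recover the size of this hmBERT backbone. A quick sanity check in plain Python (shapes copied from the dump; weight and bias terms counted per module):

```python
# Parameter count implied by the architecture dump above.
d, ff, layers = 768, 3072, 12

# word/position/token-type embeddings + embedding LayerNorm (weight + bias)
embeddings = (32001 + 512 + 2) * d + 2 * d

per_layer = (
    4 * (d * d + d)    # query, key, value, attention-output Linear layers
    + 2 * d            # attention LayerNorm (weight + bias)
    + (d * ff + ff)    # intermediate dense (768 -> 3072)
    + (ff * d + d)     # output dense (3072 -> 768)
    + 2 * d            # output LayerNorm (weight + bias)
)

pooler = d * d + d
tagger_head = d * 17 + 17  # (linear): Linear(768 -> 17) for the 17-tag dictionary

total = embeddings + layers * per_layer + pooler + tagger_head
print(f"{total:,}")  # roughly 110.6M parameters
```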
2023-10-16 20:01:57,368 ----------------------------------------------------------------------------------------------------
2023-10-16 20:01:57,369 MultiCorpus: 1085 train + 148 dev + 364 test sentences
- NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-16 20:01:57,369 ----------------------------------------------------------------------------------------------------
2023-10-16 20:01:57,369 Train: 1085 sentences
2023-10-16 20:01:57,369 (train_with_dev=False, train_with_test=False)
2023-10-16 20:01:57,369 ----------------------------------------------------------------------------------------------------
2023-10-16 20:01:57,369 Training Params:
2023-10-16 20:01:57,369 - learning_rate: "3e-05"
2023-10-16 20:01:57,369 - mini_batch_size: "8"
2023-10-16 20:01:57,369 - max_epochs: "10"
2023-10-16 20:01:57,369 - shuffle: "True"
2023-10-16 20:01:57,369 ----------------------------------------------------------------------------------------------------
2023-10-16 20:01:57,369 Plugins:
2023-10-16 20:01:57,369 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 20:01:57,369 ----------------------------------------------------------------------------------------------------
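The lr column in the iteration lines below traces a linear warmup/decay shape: warmup over the first 10% of the 136 iterations × 10 epochs = 1360 total steps up to the peak of 3e-05, then linear decay to zero. A minimal sketch of that schedule (an illustration matching the logged values, not Flair's actual LinearScheduler code):

```python
def linear_schedule_lr(step: int, peak_lr: float = 3e-05,
                       total_steps: int = 1360,
                       warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr, then linear decay to 0 (illustrative)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 136 steps here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

# Spot-checks against the log:
# epoch 1, iter 13  -> global step 13,   logged lr 0.000003
# epoch 2, iter 13  -> global step 149,  logged lr 0.000030
# epoch 10, iter 130 -> global step 1354, logged lr 0.000000
```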
2023-10-16 20:01:57,369 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 20:01:57,369 - metric: "('micro avg', 'f1-score')"
2023-10-16 20:01:57,369 ----------------------------------------------------------------------------------------------------
2023-10-16 20:01:57,369 Computation:
2023-10-16 20:01:57,369 - compute on device: cuda:0
2023-10-16 20:01:57,369 - embedding storage: none
2023-10-16 20:01:57,369 ----------------------------------------------------------------------------------------------------
2023-10-16 20:01:57,369 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-16 20:01:57,369 ----------------------------------------------------------------------------------------------------
2023-10-16 20:01:57,369 ----------------------------------------------------------------------------------------------------
2023-10-16 20:01:58,678 epoch 1 - iter 13/136 - loss 2.91442836 - time (sec): 1.31 - samples/sec: 3636.18 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:01:59,935 epoch 1 - iter 26/136 - loss 2.69069462 - time (sec): 2.56 - samples/sec: 3852.53 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:02:01,337 epoch 1 - iter 39/136 - loss 2.32855989 - time (sec): 3.97 - samples/sec: 3715.50 - lr: 0.000008 - momentum: 0.000000
2023-10-16 20:02:02,590 epoch 1 - iter 52/136 - loss 1.91332667 - time (sec): 5.22 - samples/sec: 3735.52 - lr: 0.000011 - momentum: 0.000000
2023-10-16 20:02:03,721 epoch 1 - iter 65/136 - loss 1.68190095 - time (sec): 6.35 - samples/sec: 3692.83 - lr: 0.000014 - momentum: 0.000000
2023-10-16 20:02:05,042 epoch 1 - iter 78/136 - loss 1.47476987 - time (sec): 7.67 - samples/sec: 3724.25 - lr: 0.000017 - momentum: 0.000000
2023-10-16 20:02:06,487 epoch 1 - iter 91/136 - loss 1.31930849 - time (sec): 9.12 - samples/sec: 3665.83 - lr: 0.000020 - momentum: 0.000000
2023-10-16 20:02:07,895 epoch 1 - iter 104/136 - loss 1.19645331 - time (sec): 10.53 - samples/sec: 3656.78 - lr: 0.000023 - momentum: 0.000000
2023-10-16 20:02:09,164 epoch 1 - iter 117/136 - loss 1.10784598 - time (sec): 11.79 - samples/sec: 3654.67 - lr: 0.000026 - momentum: 0.000000
2023-10-16 20:02:10,730 epoch 1 - iter 130/136 - loss 1.01171940 - time (sec): 13.36 - samples/sec: 3643.06 - lr: 0.000028 - momentum: 0.000000
2023-10-16 20:02:11,553 ----------------------------------------------------------------------------------------------------
2023-10-16 20:02:11,553 EPOCH 1 done: loss 0.9581 - lr: 0.000028
2023-10-16 20:02:12,404 DEV : loss 0.20712290704250336 - f1-score (micro avg) 0.6279
2023-10-16 20:02:12,408 saving best model
2023-10-16 20:02:12,750 ----------------------------------------------------------------------------------------------------
2023-10-16 20:02:14,373 epoch 2 - iter 13/136 - loss 0.20196333 - time (sec): 1.62 - samples/sec: 3593.92 - lr: 0.000030 - momentum: 0.000000
2023-10-16 20:02:15,730 epoch 2 - iter 26/136 - loss 0.23182921 - time (sec): 2.98 - samples/sec: 3751.59 - lr: 0.000029 - momentum: 0.000000
2023-10-16 20:02:17,132 epoch 2 - iter 39/136 - loss 0.22220376 - time (sec): 4.38 - samples/sec: 3605.53 - lr: 0.000029 - momentum: 0.000000
2023-10-16 20:02:18,487 epoch 2 - iter 52/136 - loss 0.21355988 - time (sec): 5.74 - samples/sec: 3621.88 - lr: 0.000029 - momentum: 0.000000
2023-10-16 20:02:19,801 epoch 2 - iter 65/136 - loss 0.20274936 - time (sec): 7.05 - samples/sec: 3582.27 - lr: 0.000028 - momentum: 0.000000
2023-10-16 20:02:21,200 epoch 2 - iter 78/136 - loss 0.20329918 - time (sec): 8.45 - samples/sec: 3590.76 - lr: 0.000028 - momentum: 0.000000
2023-10-16 20:02:22,649 epoch 2 - iter 91/136 - loss 0.19522258 - time (sec): 9.90 - samples/sec: 3569.48 - lr: 0.000028 - momentum: 0.000000
2023-10-16 20:02:24,045 epoch 2 - iter 104/136 - loss 0.18826517 - time (sec): 11.29 - samples/sec: 3576.97 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:02:25,318 epoch 2 - iter 117/136 - loss 0.18486750 - time (sec): 12.57 - samples/sec: 3563.05 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:02:26,795 epoch 2 - iter 130/136 - loss 0.17895143 - time (sec): 14.04 - samples/sec: 3539.43 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:02:27,497 ----------------------------------------------------------------------------------------------------
2023-10-16 20:02:27,497 EPOCH 2 done: loss 0.1777 - lr: 0.000027
2023-10-16 20:02:28,920 DEV : loss 0.13420191407203674 - f1-score (micro avg) 0.7263
2023-10-16 20:02:28,924 saving best model
2023-10-16 20:02:29,347 ----------------------------------------------------------------------------------------------------
2023-10-16 20:02:30,481 epoch 3 - iter 13/136 - loss 0.11409157 - time (sec): 1.13 - samples/sec: 3933.71 - lr: 0.000026 - momentum: 0.000000
2023-10-16 20:02:32,020 epoch 3 - iter 26/136 - loss 0.11467445 - time (sec): 2.67 - samples/sec: 3836.28 - lr: 0.000026 - momentum: 0.000000
2023-10-16 20:02:33,455 epoch 3 - iter 39/136 - loss 0.12145447 - time (sec): 4.10 - samples/sec: 3796.47 - lr: 0.000026 - momentum: 0.000000
2023-10-16 20:02:34,766 epoch 3 - iter 52/136 - loss 0.11746190 - time (sec): 5.41 - samples/sec: 3876.82 - lr: 0.000025 - momentum: 0.000000
2023-10-16 20:02:36,111 epoch 3 - iter 65/136 - loss 0.11099876 - time (sec): 6.76 - samples/sec: 3780.28 - lr: 0.000025 - momentum: 0.000000
2023-10-16 20:02:37,658 epoch 3 - iter 78/136 - loss 0.10964286 - time (sec): 8.31 - samples/sec: 3760.78 - lr: 0.000025 - momentum: 0.000000
2023-10-16 20:02:39,172 epoch 3 - iter 91/136 - loss 0.10854662 - time (sec): 9.82 - samples/sec: 3699.11 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:02:40,662 epoch 3 - iter 104/136 - loss 0.10553309 - time (sec): 11.31 - samples/sec: 3665.93 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:02:41,840 epoch 3 - iter 117/136 - loss 0.10386330 - time (sec): 12.49 - samples/sec: 3659.10 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:02:43,015 epoch 3 - iter 130/136 - loss 0.10556967 - time (sec): 13.66 - samples/sec: 3628.70 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:02:43,601 ----------------------------------------------------------------------------------------------------
2023-10-16 20:02:43,601 EPOCH 3 done: loss 0.1048 - lr: 0.000024
2023-10-16 20:02:45,029 DEV : loss 0.10961901396512985 - f1-score (micro avg) 0.7672
2023-10-16 20:02:45,033 saving best model
2023-10-16 20:02:45,458 ----------------------------------------------------------------------------------------------------
2023-10-16 20:02:47,074 epoch 4 - iter 13/136 - loss 0.05414508 - time (sec): 1.61 - samples/sec: 3350.43 - lr: 0.000023 - momentum: 0.000000
2023-10-16 20:02:48,693 epoch 4 - iter 26/136 - loss 0.06020060 - time (sec): 3.23 - samples/sec: 3409.78 - lr: 0.000023 - momentum: 0.000000
2023-10-16 20:02:50,027 epoch 4 - iter 39/136 - loss 0.06338423 - time (sec): 4.56 - samples/sec: 3491.64 - lr: 0.000022 - momentum: 0.000000
2023-10-16 20:02:51,356 epoch 4 - iter 52/136 - loss 0.06382841 - time (sec): 5.89 - samples/sec: 3498.76 - lr: 0.000022 - momentum: 0.000000
2023-10-16 20:02:52,585 epoch 4 - iter 65/136 - loss 0.07143233 - time (sec): 7.12 - samples/sec: 3559.23 - lr: 0.000022 - momentum: 0.000000
2023-10-16 20:02:53,986 epoch 4 - iter 78/136 - loss 0.06767888 - time (sec): 8.52 - samples/sec: 3533.34 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:02:55,338 epoch 4 - iter 91/136 - loss 0.06510267 - time (sec): 9.88 - samples/sec: 3553.60 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:02:56,857 epoch 4 - iter 104/136 - loss 0.06421920 - time (sec): 11.39 - samples/sec: 3500.01 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:02:58,044 epoch 4 - iter 117/136 - loss 0.06376163 - time (sec): 12.58 - samples/sec: 3506.64 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:02:59,421 epoch 4 - iter 130/136 - loss 0.06343641 - time (sec): 13.96 - samples/sec: 3550.82 - lr: 0.000020 - momentum: 0.000000
2023-10-16 20:03:00,051 ----------------------------------------------------------------------------------------------------
2023-10-16 20:03:00,052 EPOCH 4 done: loss 0.0623 - lr: 0.000020
2023-10-16 20:03:01,483 DEV : loss 0.1094428151845932 - f1-score (micro avg) 0.8187
2023-10-16 20:03:01,487 saving best model
2023-10-16 20:03:01,929 ----------------------------------------------------------------------------------------------------
2023-10-16 20:03:03,213 epoch 5 - iter 13/136 - loss 0.06363633 - time (sec): 1.28 - samples/sec: 3695.42 - lr: 0.000020 - momentum: 0.000000
2023-10-16 20:03:04,495 epoch 5 - iter 26/136 - loss 0.05077318 - time (sec): 2.56 - samples/sec: 3710.79 - lr: 0.000019 - momentum: 0.000000
2023-10-16 20:03:05,702 epoch 5 - iter 39/136 - loss 0.04917039 - time (sec): 3.77 - samples/sec: 3710.99 - lr: 0.000019 - momentum: 0.000000
2023-10-16 20:03:06,976 epoch 5 - iter 52/136 - loss 0.04742205 - time (sec): 5.04 - samples/sec: 3746.79 - lr: 0.000019 - momentum: 0.000000
2023-10-16 20:03:08,361 epoch 5 - iter 65/136 - loss 0.04702606 - time (sec): 6.43 - samples/sec: 3662.68 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:03:09,736 epoch 5 - iter 78/136 - loss 0.04392786 - time (sec): 7.80 - samples/sec: 3647.38 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:03:11,387 epoch 5 - iter 91/136 - loss 0.04239086 - time (sec): 9.45 - samples/sec: 3597.76 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:03:13,140 epoch 5 - iter 104/136 - loss 0.04275403 - time (sec): 11.21 - samples/sec: 3586.59 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:03:14,192 epoch 5 - iter 117/136 - loss 0.04101997 - time (sec): 12.26 - samples/sec: 3623.62 - lr: 0.000017 - momentum: 0.000000
2023-10-16 20:03:15,609 epoch 5 - iter 130/136 - loss 0.04182635 - time (sec): 13.67 - samples/sec: 3580.89 - lr: 0.000017 - momentum: 0.000000
2023-10-16 20:03:16,433 ----------------------------------------------------------------------------------------------------
2023-10-16 20:03:16,433 EPOCH 5 done: loss 0.0407 - lr: 0.000017
2023-10-16 20:03:17,863 DEV : loss 0.12483629584312439 - f1-score (micro avg) 0.794
2023-10-16 20:03:17,867 ----------------------------------------------------------------------------------------------------
2023-10-16 20:03:19,485 epoch 6 - iter 13/136 - loss 0.02092780 - time (sec): 1.62 - samples/sec: 3434.68 - lr: 0.000016 - momentum: 0.000000
2023-10-16 20:03:20,683 epoch 6 - iter 26/136 - loss 0.02259711 - time (sec): 2.81 - samples/sec: 3319.47 - lr: 0.000016 - momentum: 0.000000
2023-10-16 20:03:22,007 epoch 6 - iter 39/136 - loss 0.02699121 - time (sec): 4.14 - samples/sec: 3404.15 - lr: 0.000016 - momentum: 0.000000
2023-10-16 20:03:23,271 epoch 6 - iter 52/136 - loss 0.02802062 - time (sec): 5.40 - samples/sec: 3448.45 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:03:24,305 epoch 6 - iter 65/136 - loss 0.02827787 - time (sec): 6.44 - samples/sec: 3593.52 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:03:25,825 epoch 6 - iter 78/136 - loss 0.02794861 - time (sec): 7.96 - samples/sec: 3520.88 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:03:27,266 epoch 6 - iter 91/136 - loss 0.02931078 - time (sec): 9.40 - samples/sec: 3520.20 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:03:28,867 epoch 6 - iter 104/136 - loss 0.02865060 - time (sec): 11.00 - samples/sec: 3517.87 - lr: 0.000014 - momentum: 0.000000
2023-10-16 20:03:30,521 epoch 6 - iter 117/136 - loss 0.02868558 - time (sec): 12.65 - samples/sec: 3547.09 - lr: 0.000014 - momentum: 0.000000
2023-10-16 20:03:31,871 epoch 6 - iter 130/136 - loss 0.02845386 - time (sec): 14.00 - samples/sec: 3569.48 - lr: 0.000014 - momentum: 0.000000
2023-10-16 20:03:32,577 ----------------------------------------------------------------------------------------------------
2023-10-16 20:03:32,577 EPOCH 6 done: loss 0.0286 - lr: 0.000014
2023-10-16 20:03:34,007 DEV : loss 0.12027047574520111 - f1-score (micro avg) 0.8112
2023-10-16 20:03:34,012 ----------------------------------------------------------------------------------------------------
2023-10-16 20:03:35,433 epoch 7 - iter 13/136 - loss 0.01632546 - time (sec): 1.42 - samples/sec: 3589.72 - lr: 0.000013 - momentum: 0.000000
2023-10-16 20:03:36,990 epoch 7 - iter 26/136 - loss 0.02131926 - time (sec): 2.98 - samples/sec: 3686.52 - lr: 0.000013 - momentum: 0.000000
2023-10-16 20:03:38,300 epoch 7 - iter 39/136 - loss 0.02233165 - time (sec): 4.29 - samples/sec: 3694.17 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:03:39,653 epoch 7 - iter 52/136 - loss 0.02309467 - time (sec): 5.64 - samples/sec: 3711.21 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:03:40,980 epoch 7 - iter 65/136 - loss 0.02166622 - time (sec): 6.97 - samples/sec: 3722.57 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:03:42,415 epoch 7 - iter 78/136 - loss 0.02288074 - time (sec): 8.40 - samples/sec: 3673.78 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:03:43,750 epoch 7 - iter 91/136 - loss 0.02251523 - time (sec): 9.74 - samples/sec: 3611.44 - lr: 0.000011 - momentum: 0.000000
2023-10-16 20:03:45,246 epoch 7 - iter 104/136 - loss 0.02114757 - time (sec): 11.23 - samples/sec: 3653.68 - lr: 0.000011 - momentum: 0.000000
2023-10-16 20:03:46,560 epoch 7 - iter 117/136 - loss 0.02095332 - time (sec): 12.55 - samples/sec: 3646.32 - lr: 0.000011 - momentum: 0.000000
2023-10-16 20:03:47,802 epoch 7 - iter 130/136 - loss 0.02058114 - time (sec): 13.79 - samples/sec: 3624.00 - lr: 0.000010 - momentum: 0.000000
2023-10-16 20:03:48,452 ----------------------------------------------------------------------------------------------------
2023-10-16 20:03:48,452 EPOCH 7 done: loss 0.0219 - lr: 0.000010
2023-10-16 20:03:49,879 DEV : loss 0.14382445812225342 - f1-score (micro avg) 0.8089
2023-10-16 20:03:49,883 ----------------------------------------------------------------------------------------------------
2023-10-16 20:03:51,220 epoch 8 - iter 13/136 - loss 0.01528701 - time (sec): 1.34 - samples/sec: 3482.08 - lr: 0.000010 - momentum: 0.000000
2023-10-16 20:03:52,699 epoch 8 - iter 26/136 - loss 0.01188366 - time (sec): 2.82 - samples/sec: 3592.11 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:03:54,251 epoch 8 - iter 39/136 - loss 0.01575956 - time (sec): 4.37 - samples/sec: 3703.30 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:03:55,786 epoch 8 - iter 52/136 - loss 0.01481007 - time (sec): 5.90 - samples/sec: 3622.01 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:03:57,031 epoch 8 - iter 65/136 - loss 0.01596107 - time (sec): 7.15 - samples/sec: 3742.18 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:03:58,487 epoch 8 - iter 78/136 - loss 0.01548639 - time (sec): 8.60 - samples/sec: 3649.16 - lr: 0.000008 - momentum: 0.000000
2023-10-16 20:04:00,092 epoch 8 - iter 91/136 - loss 0.01463137 - time (sec): 10.21 - samples/sec: 3550.79 - lr: 0.000008 - momentum: 0.000000
2023-10-16 20:04:01,164 epoch 8 - iter 104/136 - loss 0.01552289 - time (sec): 11.28 - samples/sec: 3588.90 - lr: 0.000008 - momentum: 0.000000
2023-10-16 20:04:02,262 epoch 8 - iter 117/136 - loss 0.01605070 - time (sec): 12.38 - samples/sec: 3588.30 - lr: 0.000007 - momentum: 0.000000
2023-10-16 20:04:03,810 epoch 8 - iter 130/136 - loss 0.01578008 - time (sec): 13.93 - samples/sec: 3580.13 - lr: 0.000007 - momentum: 0.000000
2023-10-16 20:04:04,504 ----------------------------------------------------------------------------------------------------
2023-10-16 20:04:04,504 EPOCH 8 done: loss 0.0159 - lr: 0.000007
2023-10-16 20:04:05,930 DEV : loss 0.15526820719242096 - f1-score (micro avg) 0.8007
2023-10-16 20:04:05,934 ----------------------------------------------------------------------------------------------------
2023-10-16 20:04:07,279 epoch 9 - iter 13/136 - loss 0.02365417 - time (sec): 1.34 - samples/sec: 3545.27 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:04:08,970 epoch 9 - iter 26/136 - loss 0.01767866 - time (sec): 3.03 - samples/sec: 3431.24 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:04:10,268 epoch 9 - iter 39/136 - loss 0.01494692 - time (sec): 4.33 - samples/sec: 3529.52 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:04:11,536 epoch 9 - iter 52/136 - loss 0.01303926 - time (sec): 5.60 - samples/sec: 3691.23 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:04:12,742 epoch 9 - iter 65/136 - loss 0.01580492 - time (sec): 6.81 - samples/sec: 3617.65 - lr: 0.000005 - momentum: 0.000000
2023-10-16 20:04:14,120 epoch 9 - iter 78/136 - loss 0.01530890 - time (sec): 8.19 - samples/sec: 3695.01 - lr: 0.000005 - momentum: 0.000000
2023-10-16 20:04:15,422 epoch 9 - iter 91/136 - loss 0.01439618 - time (sec): 9.49 - samples/sec: 3653.04 - lr: 0.000005 - momentum: 0.000000
2023-10-16 20:04:16,551 epoch 9 - iter 104/136 - loss 0.01371922 - time (sec): 10.62 - samples/sec: 3651.04 - lr: 0.000004 - momentum: 0.000000
2023-10-16 20:04:17,991 epoch 9 - iter 117/136 - loss 0.01367269 - time (sec): 12.06 - samples/sec: 3649.94 - lr: 0.000004 - momentum: 0.000000
2023-10-16 20:04:19,478 epoch 9 - iter 130/136 - loss 0.01301378 - time (sec): 13.54 - samples/sec: 3640.90 - lr: 0.000004 - momentum: 0.000000
2023-10-16 20:04:20,266 ----------------------------------------------------------------------------------------------------
2023-10-16 20:04:20,266 EPOCH 9 done: loss 0.0133 - lr: 0.000004
2023-10-16 20:04:21,692 DEV : loss 0.15417590737342834 - f1-score (micro avg) 0.8082
2023-10-16 20:04:21,697 ----------------------------------------------------------------------------------------------------
2023-10-16 20:04:23,167 epoch 10 - iter 13/136 - loss 0.01071197 - time (sec): 1.47 - samples/sec: 3832.99 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:04:24,252 epoch 10 - iter 26/136 - loss 0.00816220 - time (sec): 2.55 - samples/sec: 3810.88 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:04:25,562 epoch 10 - iter 39/136 - loss 0.00707103 - time (sec): 3.86 - samples/sec: 3709.08 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:04:27,016 epoch 10 - iter 52/136 - loss 0.00747435 - time (sec): 5.32 - samples/sec: 3696.36 - lr: 0.000002 - momentum: 0.000000
2023-10-16 20:04:28,456 epoch 10 - iter 65/136 - loss 0.00708783 - time (sec): 6.76 - samples/sec: 3709.21 - lr: 0.000002 - momentum: 0.000000
2023-10-16 20:04:29,860 epoch 10 - iter 78/136 - loss 0.00819418 - time (sec): 8.16 - samples/sec: 3695.83 - lr: 0.000002 - momentum: 0.000000
2023-10-16 20:04:31,742 epoch 10 - iter 91/136 - loss 0.00924199 - time (sec): 10.04 - samples/sec: 3570.64 - lr: 0.000001 - momentum: 0.000000
2023-10-16 20:04:32,971 epoch 10 - iter 104/136 - loss 0.00937275 - time (sec): 11.27 - samples/sec: 3561.47 - lr: 0.000001 - momentum: 0.000000
2023-10-16 20:04:34,164 epoch 10 - iter 117/136 - loss 0.00996494 - time (sec): 12.47 - samples/sec: 3555.82 - lr: 0.000001 - momentum: 0.000000
2023-10-16 20:04:35,710 epoch 10 - iter 130/136 - loss 0.01051771 - time (sec): 14.01 - samples/sec: 3552.06 - lr: 0.000000 - momentum: 0.000000
2023-10-16 20:04:36,337 ----------------------------------------------------------------------------------------------------
2023-10-16 20:04:36,337 EPOCH 10 done: loss 0.0106 - lr: 0.000000
2023-10-16 20:04:37,766 DEV : loss 0.15838788449764252 - f1-score (micro avg) 0.797
2023-10-16 20:04:38,105 ----------------------------------------------------------------------------------------------------
2023-10-16 20:04:38,106 Loading model from best epoch ...
2023-10-16 20:04:39,473 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-16 20:04:41,439
Results:
- F-score (micro) 0.7623
- F-score (macro) 0.701
- Accuracy 0.6307

By class:
              precision    recall  f1-score   support

         LOC     0.7821    0.8397    0.8099       312
         PER     0.6944    0.8413    0.7609       208
         ORG     0.4848    0.2909    0.3636        55
   HumanProd     0.8333    0.9091    0.8696        22

   micro avg     0.7345    0.7923    0.7623       597
   macro avg     0.6987    0.7203    0.7010       597
weighted avg     0.7261    0.7923    0.7539       597
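The averaged rows of the report above can be reproduced from the per-class rows: macro avg is the unweighted mean of the class scores, weighted avg weights each class by its support, and micro f1 is the harmonic mean of micro precision and recall. A quick check in plain Python using the numbers from the table:

```python
# Per-class (precision, recall, f1, support) rows from the report above.
by_class = {
    "LOC":       (0.7821, 0.8397, 0.8099, 312),
    "PER":       (0.6944, 0.8413, 0.7609, 208),
    "ORG":       (0.4848, 0.2909, 0.3636,  55),
    "HumanProd": (0.8333, 0.9091, 0.8696,  22),
}

f1s = [row[2] for row in by_class.values()]
supports = [row[3] for row in by_class.values()]

# Macro: unweighted mean over classes.
macro_f1 = sum(f1s) / len(f1s)

# Weighted: support-weighted mean over classes.
weighted_f1 = sum(f * s for f, s in zip(f1s, supports)) / sum(supports)

# Micro: harmonic mean of the micro-averaged precision and recall.
p, r = 0.7345, 0.7923
micro_f1 = 2 * p * r / (p + r)
```

All three reproduce the report's averaged rows (0.7010, 0.7539, 0.7623) to four decimals.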
2023-10-16 20:04:41,439 ----------------------------------------------------------------------------------------------------