2023-10-16 20:01:57,367 ----------------------------------------------------------------------------------------------------
2023-10-16 20:01:57,368 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 20:01:57,368 ----------------------------------------------------------------------------------------------------
2023-10-16 20:01:57,369 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-16 20:01:57,369 ----------------------------------------------------------------------------------------------------
2023-10-16 20:01:57,369 Train: 1085 sentences
2023-10-16 20:01:57,369 (train_with_dev=False, train_with_test=False)
2023-10-16 20:01:57,369 ----------------------------------------------------------------------------------------------------
2023-10-16 20:01:57,369 Training Params:
2023-10-16 20:01:57,369 - learning_rate: "3e-05"
2023-10-16 20:01:57,369 - mini_batch_size: "8"
2023-10-16 20:01:57,369 - max_epochs: "10"
2023-10-16 20:01:57,369 - shuffle: "True"
2023-10-16 20:01:57,369 ----------------------------------------------------------------------------------------------------
2023-10-16 20:01:57,369 Plugins:
2023-10-16 20:01:57,369 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 20:01:57,369 ----------------------------------------------------------------------------------------------------
2023-10-16 20:01:57,369 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 20:01:57,369 - metric: "('micro avg', 'f1-score')"
2023-10-16 20:01:57,369 ----------------------------------------------------------------------------------------------------
2023-10-16 20:01:57,369 Computation:
2023-10-16 20:01:57,369 - compute on device: cuda:0
2023-10-16 20:01:57,369 - embedding storage: none
2023-10-16 20:01:57,369 ----------------------------------------------------------------------------------------------------
2023-10-16 20:01:57,369 Model training base path: "hmbench-newseye/sv-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-16 20:01:57,369 ----------------------------------------------------------------------------------------------------
2023-10-16 20:01:57,369 ----------------------------------------------------------------------------------------------------
2023-10-16 20:01:58,678 epoch 1 - iter 13/136 - loss 2.91442836 - time (sec): 1.31 - samples/sec: 3636.18 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:01:59,935 epoch 1 - iter 26/136 - loss 2.69069462 - time (sec): 2.56 - samples/sec: 3852.53 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:02:01,337 epoch 1 - iter 39/136 - loss 2.32855989 - time (sec): 3.97 - samples/sec: 3715.50 - lr: 0.000008 - momentum: 0.000000
2023-10-16 20:02:02,590 epoch 1 - iter 52/136 - loss 1.91332667 - time (sec): 5.22 - samples/sec: 3735.52 - lr: 0.000011 - momentum: 0.000000
2023-10-16 20:02:03,721 epoch 1 - iter 65/136 - loss 1.68190095 - time (sec): 6.35 - samples/sec: 3692.83 - lr: 0.000014 - momentum: 0.000000
2023-10-16 20:02:05,042 epoch 1 - iter 78/136 - loss 1.47476987 - time (sec): 7.67 - samples/sec: 3724.25 - lr: 0.000017 - momentum: 0.000000
2023-10-16 20:02:06,487 epoch 1 - iter 91/136 - loss 1.31930849 - time (sec): 9.12 - samples/sec: 3665.83 - lr: 0.000020 - momentum: 0.000000
2023-10-16 20:02:07,895 epoch 1 - iter 104/136 - loss 1.19645331 - time (sec): 10.53 - samples/sec: 3656.78 - lr: 0.000023 - momentum: 0.000000
2023-10-16 20:02:09,164 epoch 1 - iter 117/136 - loss 1.10784598 - time (sec): 11.79 - samples/sec: 3654.67 - lr: 0.000026 - momentum: 0.000000
2023-10-16 20:02:10,730 epoch 1 - iter 130/136 - loss 1.01171940 - time (sec): 13.36 - samples/sec: 3643.06 - lr: 0.000028 - momentum: 0.000000
2023-10-16 20:02:11,553 ----------------------------------------------------------------------------------------------------
2023-10-16 20:02:11,553 EPOCH 1 done: loss 0.9581 - lr: 0.000028
2023-10-16 20:02:12,404 DEV : loss 0.20712290704250336 - f1-score (micro avg)  0.6279
2023-10-16 20:02:12,408 saving best model
2023-10-16 20:02:12,750 ----------------------------------------------------------------------------------------------------
2023-10-16 20:02:14,373 epoch 2 - iter 13/136 - loss 0.20196333 - time (sec): 1.62 - samples/sec: 3593.92 - lr: 0.000030 - momentum: 0.000000
2023-10-16 20:02:15,730 epoch 2 - iter 26/136 - loss 0.23182921 - time (sec): 2.98 - samples/sec: 3751.59 - lr: 0.000029 - momentum: 0.000000
2023-10-16 20:02:17,132 epoch 2 - iter 39/136 - loss 0.22220376 - time (sec): 4.38 - samples/sec: 3605.53 - lr: 0.000029 - momentum: 0.000000
2023-10-16 20:02:18,487 epoch 2 - iter 52/136 - loss 0.21355988 - time (sec): 5.74 - samples/sec: 3621.88 - lr: 0.000029 - momentum: 0.000000
2023-10-16 20:02:19,801 epoch 2 - iter 65/136 - loss 0.20274936 - time (sec): 7.05 - samples/sec: 3582.27 - lr: 0.000028 - momentum: 0.000000
2023-10-16 20:02:21,200 epoch 2 - iter 78/136 - loss 0.20329918 - time (sec): 8.45 - samples/sec: 3590.76 - lr: 0.000028 - momentum: 0.000000
2023-10-16 20:02:22,649 epoch 2 - iter 91/136 - loss 0.19522258 - time (sec): 9.90 - samples/sec: 3569.48 - lr: 0.000028 - momentum: 0.000000
2023-10-16 20:02:24,045 epoch 2 - iter 104/136 - loss 0.18826517 - time (sec): 11.29 - samples/sec: 3576.97 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:02:25,318 epoch 2 - iter 117/136 - loss 0.18486750 - time (sec): 12.57 - samples/sec: 3563.05 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:02:26,795 epoch 2 - iter 130/136 - loss 0.17895143 - time (sec): 14.04 - samples/sec: 3539.43 - lr: 0.000027 - momentum: 0.000000
2023-10-16 20:02:27,497 ----------------------------------------------------------------------------------------------------
2023-10-16 20:02:27,497 EPOCH 2 done: loss 0.1777 - lr: 0.000027
2023-10-16 20:02:28,920 DEV : loss 0.13420191407203674 - f1-score (micro avg)  0.7263
2023-10-16 20:02:28,924 saving best model
2023-10-16 20:02:29,347 ----------------------------------------------------------------------------------------------------
2023-10-16 20:02:30,481 epoch 3 - iter 13/136 - loss 0.11409157 - time (sec): 1.13 - samples/sec: 3933.71 - lr: 0.000026 - momentum: 0.000000
2023-10-16 20:02:32,020 epoch 3 - iter 26/136 - loss 0.11467445 - time (sec): 2.67 - samples/sec: 3836.28 - lr: 0.000026 - momentum: 0.000000
2023-10-16 20:02:33,455 epoch 3 - iter 39/136 - loss 0.12145447 - time (sec): 4.10 - samples/sec: 3796.47 - lr: 0.000026 - momentum: 0.000000
2023-10-16 20:02:34,766 epoch 3 - iter 52/136 - loss 0.11746190 - time (sec): 5.41 - samples/sec: 3876.82 - lr: 0.000025 - momentum: 0.000000
2023-10-16 20:02:36,111 epoch 3 - iter 65/136 - loss 0.11099876 - time (sec): 6.76 - samples/sec: 3780.28 - lr: 0.000025 - momentum: 0.000000
2023-10-16 20:02:37,658 epoch 3 - iter 78/136 - loss 0.10964286 - time (sec): 8.31 - samples/sec: 3760.78 - lr: 0.000025 - momentum: 0.000000
2023-10-16 20:02:39,172 epoch 3 - iter 91/136 - loss 0.10854662 - time (sec): 9.82 - samples/sec: 3699.11 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:02:40,662 epoch 3 - iter 104/136 - loss 0.10553309 - time (sec): 11.31 - samples/sec: 3665.93 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:02:41,840 epoch 3 - iter 117/136 - loss 0.10386330 - time (sec): 12.49 - samples/sec: 3659.10 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:02:43,015 epoch 3 - iter 130/136 - loss 0.10556967 - time (sec): 13.66 - samples/sec: 3628.70 - lr: 0.000024 - momentum: 0.000000
2023-10-16 20:02:43,601 ----------------------------------------------------------------------------------------------------
2023-10-16 20:02:43,601 EPOCH 3 done: loss 0.1048 - lr: 0.000024
2023-10-16 20:02:45,029 DEV : loss 0.10961901396512985 - f1-score (micro avg)  0.7672
2023-10-16 20:02:45,033 saving best model
2023-10-16 20:02:45,458 ----------------------------------------------------------------------------------------------------
2023-10-16 20:02:47,074 epoch 4 - iter 13/136 - loss 0.05414508 - time (sec): 1.61 - samples/sec: 3350.43 - lr: 0.000023 - momentum: 0.000000
2023-10-16 20:02:48,693 epoch 4 - iter 26/136 - loss 0.06020060 - time (sec): 3.23 - samples/sec: 3409.78 - lr: 0.000023 - momentum: 0.000000
2023-10-16 20:02:50,027 epoch 4 - iter 39/136 - loss 0.06338423 - time (sec): 4.56 - samples/sec: 3491.64 - lr: 0.000022 - momentum: 0.000000
2023-10-16 20:02:51,356 epoch 4 - iter 52/136 - loss 0.06382841 - time (sec): 5.89 - samples/sec: 3498.76 - lr: 0.000022 - momentum: 0.000000
2023-10-16 20:02:52,585 epoch 4 - iter 65/136 - loss 0.07143233 - time (sec): 7.12 - samples/sec: 3559.23 - lr: 0.000022 - momentum: 0.000000
2023-10-16 20:02:53,986 epoch 4 - iter 78/136 - loss 0.06767888 - time (sec): 8.52 - samples/sec: 3533.34 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:02:55,338 epoch 4 - iter 91/136 - loss 0.06510267 - time (sec): 9.88 - samples/sec: 3553.60 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:02:56,857 epoch 4 - iter 104/136 - loss 0.06421920 - time (sec): 11.39 - samples/sec: 3500.01 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:02:58,044 epoch 4 - iter 117/136 - loss 0.06376163 - time (sec): 12.58 - samples/sec: 3506.64 - lr: 0.000021 - momentum: 0.000000
2023-10-16 20:02:59,421 epoch 4 - iter 130/136 - loss 0.06343641 - time (sec): 13.96 - samples/sec: 3550.82 - lr: 0.000020 - momentum: 0.000000
2023-10-16 20:03:00,051 ----------------------------------------------------------------------------------------------------
2023-10-16 20:03:00,052 EPOCH 4 done: loss 0.0623 - lr: 0.000020
2023-10-16 20:03:01,483 DEV : loss 0.1094428151845932 - f1-score (micro avg)  0.8187
2023-10-16 20:03:01,487 saving best model
2023-10-16 20:03:01,929 ----------------------------------------------------------------------------------------------------
2023-10-16 20:03:03,213 epoch 5 - iter 13/136 - loss 0.06363633 - time (sec): 1.28 - samples/sec: 3695.42 - lr: 0.000020 - momentum: 0.000000
2023-10-16 20:03:04,495 epoch 5 - iter 26/136 - loss 0.05077318 - time (sec): 2.56 - samples/sec: 3710.79 - lr: 0.000019 - momentum: 0.000000
2023-10-16 20:03:05,702 epoch 5 - iter 39/136 - loss 0.04917039 - time (sec): 3.77 - samples/sec: 3710.99 - lr: 0.000019 - momentum: 0.000000
2023-10-16 20:03:06,976 epoch 5 - iter 52/136 - loss 0.04742205 - time (sec): 5.04 - samples/sec: 3746.79 - lr: 0.000019 - momentum: 0.000000
2023-10-16 20:03:08,361 epoch 5 - iter 65/136 - loss 0.04702606 - time (sec): 6.43 - samples/sec: 3662.68 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:03:09,736 epoch 5 - iter 78/136 - loss 0.04392786 - time (sec): 7.80 - samples/sec: 3647.38 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:03:11,387 epoch 5 - iter 91/136 - loss 0.04239086 - time (sec): 9.45 - samples/sec: 3597.76 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:03:13,140 epoch 5 - iter 104/136 - loss 0.04275403 - time (sec): 11.21 - samples/sec: 3586.59 - lr: 0.000018 - momentum: 0.000000
2023-10-16 20:03:14,192 epoch 5 - iter 117/136 - loss 0.04101997 - time (sec): 12.26 - samples/sec: 3623.62 - lr: 0.000017 - momentum: 0.000000
2023-10-16 20:03:15,609 epoch 5 - iter 130/136 - loss 0.04182635 - time (sec): 13.67 - samples/sec: 3580.89 - lr: 0.000017 - momentum: 0.000000
2023-10-16 20:03:16,433 ----------------------------------------------------------------------------------------------------
2023-10-16 20:03:16,433 EPOCH 5 done: loss 0.0407 - lr: 0.000017
2023-10-16 20:03:17,863 DEV : loss 0.12483629584312439 - f1-score (micro avg)  0.794
2023-10-16 20:03:17,867 ----------------------------------------------------------------------------------------------------
2023-10-16 20:03:19,485 epoch 6 - iter 13/136 - loss 0.02092780 - time (sec): 1.62 - samples/sec: 3434.68 - lr: 0.000016 - momentum: 0.000000
2023-10-16 20:03:20,683 epoch 6 - iter 26/136 - loss 0.02259711 - time (sec): 2.81 - samples/sec: 3319.47 - lr: 0.000016 - momentum: 0.000000
2023-10-16 20:03:22,007 epoch 6 - iter 39/136 - loss 0.02699121 - time (sec): 4.14 - samples/sec: 3404.15 - lr: 0.000016 - momentum: 0.000000
2023-10-16 20:03:23,271 epoch 6 - iter 52/136 - loss 0.02802062 - time (sec): 5.40 - samples/sec: 3448.45 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:03:24,305 epoch 6 - iter 65/136 - loss 0.02827787 - time (sec): 6.44 - samples/sec: 3593.52 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:03:25,825 epoch 6 - iter 78/136 - loss 0.02794861 - time (sec): 7.96 - samples/sec: 3520.88 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:03:27,266 epoch 6 - iter 91/136 - loss 0.02931078 - time (sec): 9.40 - samples/sec: 3520.20 - lr: 0.000015 - momentum: 0.000000
2023-10-16 20:03:28,867 epoch 6 - iter 104/136 - loss 0.02865060 - time (sec): 11.00 - samples/sec: 3517.87 - lr: 0.000014 - momentum: 0.000000
2023-10-16 20:03:30,521 epoch 6 - iter 117/136 - loss 0.02868558 - time (sec): 12.65 - samples/sec: 3547.09 - lr: 0.000014 - momentum: 0.000000
2023-10-16 20:03:31,871 epoch 6 - iter 130/136 - loss 0.02845386 - time (sec): 14.00 - samples/sec: 3569.48 - lr: 0.000014 - momentum: 0.000000
2023-10-16 20:03:32,577 ----------------------------------------------------------------------------------------------------
2023-10-16 20:03:32,577 EPOCH 6 done: loss 0.0286 - lr: 0.000014
2023-10-16 20:03:34,007 DEV : loss 0.12027047574520111 - f1-score (micro avg)  0.8112
2023-10-16 20:03:34,012 ----------------------------------------------------------------------------------------------------
2023-10-16 20:03:35,433 epoch 7 - iter 13/136 - loss 0.01632546 - time (sec): 1.42 - samples/sec: 3589.72 - lr: 0.000013 - momentum: 0.000000
2023-10-16 20:03:36,990 epoch 7 - iter 26/136 - loss 0.02131926 - time (sec): 2.98 - samples/sec: 3686.52 - lr: 0.000013 - momentum: 0.000000
2023-10-16 20:03:38,300 epoch 7 - iter 39/136 - loss 0.02233165 - time (sec): 4.29 - samples/sec: 3694.17 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:03:39,653 epoch 7 - iter 52/136 - loss 0.02309467 - time (sec): 5.64 - samples/sec: 3711.21 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:03:40,980 epoch 7 - iter 65/136 - loss 0.02166622 - time (sec): 6.97 - samples/sec: 3722.57 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:03:42,415 epoch 7 - iter 78/136 - loss 0.02288074 - time (sec): 8.40 - samples/sec: 3673.78 - lr: 0.000012 - momentum: 0.000000
2023-10-16 20:03:43,750 epoch 7 - iter 91/136 - loss 0.02251523 - time (sec): 9.74 - samples/sec: 3611.44 - lr: 0.000011 - momentum: 0.000000
2023-10-16 20:03:45,246 epoch 7 - iter 104/136 - loss 0.02114757 - time (sec): 11.23 - samples/sec: 3653.68 - lr: 0.000011 - momentum: 0.000000
2023-10-16 20:03:46,560 epoch 7 - iter 117/136 - loss 0.02095332 - time (sec): 12.55 - samples/sec: 3646.32 - lr: 0.000011 - momentum: 0.000000
2023-10-16 20:03:47,802 epoch 7 - iter 130/136 - loss 0.02058114 - time (sec): 13.79 - samples/sec: 3624.00 - lr: 0.000010 - momentum: 0.000000
2023-10-16 20:03:48,452 ----------------------------------------------------------------------------------------------------
2023-10-16 20:03:48,452 EPOCH 7 done: loss 0.0219 - lr: 0.000010
2023-10-16 20:03:49,879 DEV : loss 0.14382445812225342 - f1-score (micro avg)  0.8089
2023-10-16 20:03:49,883 ----------------------------------------------------------------------------------------------------
2023-10-16 20:03:51,220 epoch 8 - iter 13/136 - loss 0.01528701 - time (sec): 1.34 - samples/sec: 3482.08 - lr: 0.000010 - momentum: 0.000000
2023-10-16 20:03:52,699 epoch 8 - iter 26/136 - loss 0.01188366 - time (sec): 2.82 - samples/sec: 3592.11 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:03:54,251 epoch 8 - iter 39/136 - loss 0.01575956 - time (sec): 4.37 - samples/sec: 3703.30 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:03:55,786 epoch 8 - iter 52/136 - loss 0.01481007 - time (sec): 5.90 - samples/sec: 3622.01 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:03:57,031 epoch 8 - iter 65/136 - loss 0.01596107 - time (sec): 7.15 - samples/sec: 3742.18 - lr: 0.000009 - momentum: 0.000000
2023-10-16 20:03:58,487 epoch 8 - iter 78/136 - loss 0.01548639 - time (sec): 8.60 - samples/sec: 3649.16 - lr: 0.000008 - momentum: 0.000000
2023-10-16 20:04:00,092 epoch 8 - iter 91/136 - loss 0.01463137 - time (sec): 10.21 - samples/sec: 3550.79 - lr: 0.000008 - momentum: 0.000000
2023-10-16 20:04:01,164 epoch 8 - iter 104/136 - loss 0.01552289 - time (sec): 11.28 - samples/sec: 3588.90 - lr: 0.000008 - momentum: 0.000000
2023-10-16 20:04:02,262 epoch 8 - iter 117/136 - loss 0.01605070 - time (sec): 12.38 - samples/sec: 3588.30 - lr: 0.000007 - momentum: 0.000000
2023-10-16 20:04:03,810 epoch 8 - iter 130/136 - loss 0.01578008 - time (sec): 13.93 - samples/sec: 3580.13 - lr: 0.000007 - momentum: 0.000000
2023-10-16 20:04:04,504 ----------------------------------------------------------------------------------------------------
2023-10-16 20:04:04,504 EPOCH 8 done: loss 0.0159 - lr: 0.000007
2023-10-16 20:04:05,930 DEV : loss 0.15526820719242096 - f1-score (micro avg)  0.8007
2023-10-16 20:04:05,934 ----------------------------------------------------------------------------------------------------
2023-10-16 20:04:07,279 epoch 9 - iter 13/136 - loss 0.02365417 - time (sec): 1.34 - samples/sec: 3545.27 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:04:08,970 epoch 9 - iter 26/136 - loss 0.01767866 - time (sec): 3.03 - samples/sec: 3431.24 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:04:10,268 epoch 9 - iter 39/136 - loss 0.01494692 - time (sec): 4.33 - samples/sec: 3529.52 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:04:11,536 epoch 9 - iter 52/136 - loss 0.01303926 - time (sec): 5.60 - samples/sec: 3691.23 - lr: 0.000006 - momentum: 0.000000
2023-10-16 20:04:12,742 epoch 9 - iter 65/136 - loss 0.01580492 - time (sec): 6.81 - samples/sec: 3617.65 - lr: 0.000005 - momentum: 0.000000
2023-10-16 20:04:14,120 epoch 9 - iter 78/136 - loss 0.01530890 - time (sec): 8.19 - samples/sec: 3695.01 - lr: 0.000005 - momentum: 0.000000
2023-10-16 20:04:15,422 epoch 9 - iter 91/136 - loss 0.01439618 - time (sec): 9.49 - samples/sec: 3653.04 - lr: 0.000005 - momentum: 0.000000
2023-10-16 20:04:16,551 epoch 9 - iter 104/136 - loss 0.01371922 - time (sec): 10.62 - samples/sec: 3651.04 - lr: 0.000004 - momentum: 0.000000
2023-10-16 20:04:17,991 epoch 9 - iter 117/136 - loss 0.01367269 - time (sec): 12.06 - samples/sec: 3649.94 - lr: 0.000004 - momentum: 0.000000
2023-10-16 20:04:19,478 epoch 9 - iter 130/136 - loss 0.01301378 - time (sec): 13.54 - samples/sec: 3640.90 - lr: 0.000004 - momentum: 0.000000
2023-10-16 20:04:20,266 ----------------------------------------------------------------------------------------------------
2023-10-16 20:04:20,266 EPOCH 9 done: loss 0.0133 - lr: 0.000004
2023-10-16 20:04:21,692 DEV : loss 0.15417590737342834 - f1-score (micro avg)  0.8082
2023-10-16 20:04:21,697 ----------------------------------------------------------------------------------------------------
2023-10-16 20:04:23,167 epoch 10 - iter 13/136 - loss 0.01071197 - time (sec): 1.47 - samples/sec: 3832.99 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:04:24,252 epoch 10 - iter 26/136 - loss 0.00816220 - time (sec): 2.55 - samples/sec: 3810.88 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:04:25,562 epoch 10 - iter 39/136 - loss 0.00707103 - time (sec): 3.86 - samples/sec: 3709.08 - lr: 0.000003 - momentum: 0.000000
2023-10-16 20:04:27,016 epoch 10 - iter 52/136 - loss 0.00747435 - time (sec): 5.32 - samples/sec: 3696.36 - lr: 0.000002 - momentum: 0.000000
2023-10-16 20:04:28,456 epoch 10 - iter 65/136 - loss 0.00708783 - time (sec): 6.76 - samples/sec: 3709.21 - lr: 0.000002 - momentum: 0.000000
2023-10-16 20:04:29,860 epoch 10 - iter 78/136 - loss 0.00819418 - time (sec): 8.16 - samples/sec: 3695.83 - lr: 0.000002 - momentum: 0.000000
2023-10-16 20:04:31,742 epoch 10 - iter 91/136 - loss 0.00924199 - time (sec): 10.04 - samples/sec: 3570.64 - lr: 0.000001 - momentum: 0.000000
2023-10-16 20:04:32,971 epoch 10 - iter 104/136 - loss 0.00937275 - time (sec): 11.27 - samples/sec: 3561.47 - lr: 0.000001 - momentum: 0.000000
2023-10-16 20:04:34,164 epoch 10 - iter 117/136 - loss 0.00996494 - time (sec): 12.47 - samples/sec: 3555.82 - lr: 0.000001 - momentum: 0.000000
2023-10-16 20:04:35,710 epoch 10 - iter 130/136 - loss 0.01051771 - time (sec): 14.01 - samples/sec: 3552.06 - lr: 0.000000 - momentum: 0.000000
2023-10-16 20:04:36,337 ----------------------------------------------------------------------------------------------------
2023-10-16 20:04:36,337 EPOCH 10 done: loss 0.0106 - lr: 0.000000
2023-10-16 20:04:37,766 DEV : loss 0.15838788449764252 - f1-score (micro avg)  0.797
2023-10-16 20:04:38,105 ----------------------------------------------------------------------------------------------------
2023-10-16 20:04:38,106 Loading model from best epoch ...
2023-10-16 20:04:39,473 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-16 20:04:41,439
Results:
- F-score (micro) 0.7623
- F-score (macro) 0.701
- Accuracy 0.6307

By class:
              precision    recall  f1-score   support

         LOC     0.7821    0.8397    0.8099       312
         PER     0.6944    0.8413    0.7609       208
         ORG     0.4848    0.2909    0.3636        55
   HumanProd     0.8333    0.9091    0.8696        22

   micro avg     0.7345    0.7923    0.7623       597
   macro avg     0.6987    0.7203    0.7010       597
weighted avg     0.7261    0.7923    0.7539       597

2023-10-16 20:04:41,439 ----------------------------------------------------------------------------------------------------