2023-10-17 20:42:20,536 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:20,537 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 20:42:20,537 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:20,537 MultiCorpus: 1085 train + 148 dev + 364 test sentences
 - NER_HIPE_2022 Corpus: 1085 train + 148 dev + 364 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/sv/with_doc_seperator
2023-10-17 20:42:20,537 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:20,537 Train:  1085 sentences
2023-10-17 20:42:20,537         (train_with_dev=False, train_with_test=False)
2023-10-17 20:42:20,537 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:20,537 Training Params:
2023-10-17 20:42:20,537  - learning_rate: "5e-05"
2023-10-17 20:42:20,537  - mini_batch_size: "8"
2023-10-17 20:42:20,537  - max_epochs: "10"
2023-10-17 20:42:20,537  - shuffle: "True"
2023-10-17 20:42:20,537 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:20,537 Plugins:
2023-10-17 20:42:20,537  - TensorboardLogger
2023-10-17 20:42:20,537  - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 20:42:20,537 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:20,537 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 20:42:20,537  - metric: "('micro avg', 'f1-score')"
2023-10-17 20:42:20,537 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:20,537 Computation:
2023-10-17 20:42:20,537  - compute on device: cuda:0
2023-10-17 20:42:20,537  - embedding storage: none
2023-10-17 20:42:20,537 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:20,538 Model training base path: "hmbench-newseye/sv-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-17 20:42:20,538 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:20,538 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:20,538 Logging anything other than scalars to TensorBoard
is currently not supported.
2023-10-17 20:42:21,708 epoch 1 - iter 13/136 - loss 3.61948799 - time (sec): 1.17 - samples/sec: 4249.82 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:42:23,195 epoch 1 - iter 26/136 - loss 3.13237506 - time (sec): 2.66 - samples/sec: 3779.78 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:42:24,833 epoch 1 - iter 39/136 - loss 2.34189086 - time (sec): 4.29 - samples/sec: 3817.99 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:42:26,155 epoch 1 - iter 52/136 - loss 1.92616474 - time (sec): 5.62 - samples/sec: 3869.56 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:42:27,501 epoch 1 - iter 65/136 - loss 1.70538327 - time (sec): 6.96 - samples/sec: 3747.85 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:42:28,954 epoch 1 - iter 78/136 - loss 1.51058071 - time (sec): 8.42 - samples/sec: 3664.50 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:42:30,242 epoch 1 - iter 91/136 - loss 1.36273296 - time (sec): 9.70 - samples/sec: 3634.79 - lr: 0.000033 - momentum: 0.000000
2023-10-17 20:42:31,421 epoch 1 - iter 104/136 - loss 1.23429361 - time (sec): 10.88 - samples/sec: 3663.48 - lr: 0.000038 - momentum: 0.000000
2023-10-17 20:42:32,684 epoch 1 - iter 117/136 - loss 1.12489260 - time (sec): 12.15 - samples/sec: 3684.80 - lr: 0.000043 - momentum: 0.000000
2023-10-17 20:42:34,173 epoch 1 - iter 130/136 - loss 1.01914801 - time (sec): 13.63 - samples/sec: 3691.10 - lr: 0.000047 - momentum: 0.000000
2023-10-17 20:42:34,636 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:34,636 EPOCH 1 done: loss 0.9934 - lr: 0.000047
2023-10-17 20:42:35,690 DEV : loss 0.17961926758289337 - f1-score (micro avg) 0.5662
2023-10-17 20:42:35,695 saving best model
2023-10-17 20:42:36,051 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:37,254 epoch 2 - iter 13/136 - loss 0.15228697 - time (sec): 1.20 - samples/sec: 3859.37 - lr: 0.000050 - momentum: 0.000000
2023-10-17 20:42:38,649 epoch 2 - iter 26/136 - loss 0.16566093 - time (sec): 2.60 - samples/sec: 3461.64 - lr: 0.000049 - momentum: 0.000000
2023-10-17 20:42:40,011 epoch 2 - iter 39/136 - loss 0.17398233 - time (sec): 3.96 - samples/sec: 3553.30 - lr: 0.000048 - momentum: 0.000000
2023-10-17 20:42:41,406 epoch 2 - iter 52/136 - loss 0.16043159 - time (sec): 5.35 - samples/sec: 3638.07 - lr: 0.000048 - momentum: 0.000000
2023-10-17 20:42:42,617 epoch 2 - iter 65/136 - loss 0.15531100 - time (sec): 6.56 - samples/sec: 3667.80 - lr: 0.000047 - momentum: 0.000000
2023-10-17 20:42:44,001 epoch 2 - iter 78/136 - loss 0.14560993 - time (sec): 7.95 - samples/sec: 3688.02 - lr: 0.000047 - momentum: 0.000000
2023-10-17 20:42:45,411 epoch 2 - iter 91/136 - loss 0.15781392 - time (sec): 9.36 - samples/sec: 3668.01 - lr: 0.000046 - momentum: 0.000000
2023-10-17 20:42:46,561 epoch 2 - iter 104/136 - loss 0.15797681 - time (sec): 10.51 - samples/sec: 3665.01 - lr: 0.000046 - momentum: 0.000000
2023-10-17 20:42:48,140 epoch 2 - iter 117/136 - loss 0.15530704 - time (sec): 12.09 - samples/sec: 3654.54 - lr: 0.000045 - momentum: 0.000000
2023-10-17 20:42:49,877 epoch 2 - iter 130/136 - loss 0.15218247 - time (sec): 13.82 - samples/sec: 3627.87 - lr: 0.000045 - momentum: 0.000000
2023-10-17 20:42:50,416 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:50,416 EPOCH 2 done: loss 0.1500 - lr: 0.000045
2023-10-17 20:42:51,874 DEV : loss 0.11248722672462463 - f1-score (micro avg) 0.7532
2023-10-17 20:42:51,879 saving best model
2023-10-17 20:42:52,359 ----------------------------------------------------------------------------------------------------
2023-10-17 20:42:53,518 epoch 3 - iter 13/136 - loss 0.10254776 - time (sec): 1.16 - samples/sec: 3639.39 - lr: 0.000044 - momentum: 0.000000
2023-10-17 20:42:54,973 epoch 3 - iter 26/136 - loss 0.08535292 - time (sec): 2.61 - samples/sec: 3632.11 - lr: 0.000043 - momentum: 0.000000
2023-10-17 20:42:56,426 epoch 3 - iter 39/136 - loss 0.08605534 - time (sec): 4.07 - samples/sec: 3592.38 - lr: 0.000043 - momentum: 0.000000
2023-10-17 20:42:57,652 epoch 3 - iter 52/136 - loss 0.08805134 - time (sec): 5.29 - samples/sec: 3651.61 - lr: 0.000042 - momentum: 0.000000
2023-10-17 20:42:59,085 epoch 3 - iter 65/136 - loss 0.09492906 - time (sec): 6.72 - samples/sec: 3646.33 - lr: 0.000042 - momentum: 0.000000
2023-10-17 20:43:00,479 epoch 3 - iter 78/136 - loss 0.09268581 - time (sec): 8.12 - samples/sec: 3714.31 - lr: 0.000041 - momentum: 0.000000
2023-10-17 20:43:01,836 epoch 3 - iter 91/136 - loss 0.09202451 - time (sec): 9.48 - samples/sec: 3675.97 - lr: 0.000041 - momentum: 0.000000
2023-10-17 20:43:03,179 epoch 3 - iter 104/136 - loss 0.08747935 - time (sec): 10.82 - samples/sec: 3663.44 - lr: 0.000040 - momentum: 0.000000
2023-10-17 20:43:04,878 epoch 3 - iter 117/136 - loss 0.08598807 - time (sec): 12.52 - samples/sec: 3638.22 - lr: 0.000040 - momentum: 0.000000
2023-10-17 20:43:06,097 epoch 3 - iter 130/136 - loss 0.08406916 - time (sec): 13.74 - samples/sec: 3649.73 - lr: 0.000039 - momentum: 0.000000
2023-10-17 20:43:06,660 ----------------------------------------------------------------------------------------------------
2023-10-17 20:43:06,660 EPOCH 3 done: loss 0.0864 - lr: 0.000039
2023-10-17 20:43:08,273 DEV : loss 0.10026960074901581 - f1-score (micro avg) 0.776
2023-10-17 20:43:08,278 saving best model
2023-10-17 20:43:08,718 ----------------------------------------------------------------------------------------------------
2023-10-17 20:43:10,225 epoch 4 - iter 13/136 - loss 0.04229442 - time (sec): 1.51 - samples/sec: 3161.21 - lr: 0.000038 - momentum: 0.000000
2023-10-17 20:43:11,715 epoch 4 - iter 26/136 - loss 0.05020601 - time (sec): 3.00 - samples/sec: 3249.66 - lr: 0.000038 - momentum: 0.000000
2023-10-17 20:43:13,276 epoch 4 - iter 39/136 - loss 0.04768072 - time (sec): 4.56 - samples/sec: 3218.31 - lr: 0.000037 - momentum: 0.000000
2023-10-17 20:43:14,752 epoch 4 - iter 52/136 - loss 0.04433361 - time (sec): 6.03 - samples/sec: 3268.99 - lr: 0.000037 - momentum: 0.000000
2023-10-17 20:43:16,373 epoch 4 - iter 65/136 - loss 0.04410039 - time (sec): 7.65 - samples/sec: 3356.66 - lr: 0.000036 - momentum: 0.000000
2023-10-17 20:43:17,660 epoch 4 - iter 78/136 - loss 0.04603029 - time (sec): 8.94 - samples/sec: 3427.50 - lr: 0.000036 - momentum: 0.000000
2023-10-17 20:43:19,118 epoch 4 - iter 91/136 - loss 0.04570230 - time (sec): 10.40 - samples/sec: 3411.83 - lr: 0.000035 - momentum: 0.000000
2023-10-17 20:43:20,427 epoch 4 - iter 104/136 - loss 0.05176775 - time (sec): 11.71 - samples/sec: 3444.17 - lr: 0.000035 - momentum: 0.000000
2023-10-17 20:43:21,663 epoch 4 - iter 117/136 - loss 0.05143245 - time (sec): 12.94 - samples/sec: 3480.37 - lr: 0.000034 - momentum: 0.000000
2023-10-17 20:43:23,091 epoch 4 - iter 130/136 - loss 0.04955538 - time (sec): 14.37 - samples/sec: 3462.37 - lr: 0.000034 - momentum: 0.000000
2023-10-17 20:43:23,671 ----------------------------------------------------------------------------------------------------
2023-10-17 20:43:23,671 EPOCH 4 done: loss 0.0496 - lr: 0.000034
2023-10-17 20:43:25,138 DEV : loss 0.11402004957199097 - f1-score (micro avg) 0.792
2023-10-17 20:43:25,144 saving best model
2023-10-17 20:43:25,611 ----------------------------------------------------------------------------------------------------
2023-10-17 20:43:26,954 epoch 5 - iter 13/136 - loss 0.03063410 - time (sec): 1.32 - samples/sec: 4249.37 - lr: 0.000033 - momentum: 0.000000
2023-10-17 20:43:28,426 epoch 5 - iter 26/136 - loss 0.02995518 - time (sec): 2.80 - samples/sec: 4003.21 - lr: 0.000032 - momentum: 0.000000
2023-10-17 20:43:29,698 epoch 5 - iter 39/136 - loss 0.02782725 - time (sec): 4.07 - samples/sec: 3852.73 - lr: 0.000032 - momentum: 0.000000
2023-10-17 20:43:31,184 epoch 5 - iter 52/136 - loss 0.02730023 - time (sec): 5.55 - samples/sec: 3800.64 - lr: 0.000031 - momentum: 0.000000
2023-10-17 20:43:32,669 epoch 5 - iter 65/136 - loss 0.03436385 - time (sec): 7.04 - samples/sec: 3746.35 - lr: 0.000031 - momentum: 0.000000
2023-10-17 20:43:33,900 epoch 5 - iter 78/136 - loss 0.03270278 - time (sec): 8.27 - samples/sec: 3724.77 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:43:35,271 epoch 5 - iter 91/136 - loss 0.03166123 - time (sec): 9.64 - samples/sec: 3692.33 - lr: 0.000030 - momentum: 0.000000
2023-10-17 20:43:36,407 epoch 5 - iter 104/136 - loss 0.03261930 - time (sec): 10.78 - samples/sec: 3722.45 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:43:37,855 epoch 5 - iter 117/136 - loss 0.03205863 - time (sec): 12.23 - samples/sec: 3715.32 - lr: 0.000029 - momentum: 0.000000
2023-10-17 20:43:39,189 epoch 5 - iter 130/136 - loss 0.03339591 - time (sec): 13.56 - samples/sec: 3701.65 - lr: 0.000028 - momentum: 0.000000
2023-10-17 20:43:39,701 ----------------------------------------------------------------------------------------------------
2023-10-17 20:43:39,701 EPOCH 5 done: loss 0.0349 - lr: 0.000028
2023-10-17 20:43:41,161 DEV : loss 0.1294308602809906 - f1-score (micro avg) 0.8029
2023-10-17 20:43:41,171 saving best model
2023-10-17 20:43:41,686 ----------------------------------------------------------------------------------------------------
2023-10-17 20:43:43,361 epoch 6 - iter 13/136 - loss 0.02234781 - time (sec): 1.67 - samples/sec: 3017.58 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:43:44,615 epoch 6 - iter 26/136 - loss 0.03157132 - time (sec): 2.93 - samples/sec: 3241.45 - lr: 0.000027 - momentum: 0.000000
2023-10-17 20:43:45,968 epoch 6 - iter 39/136 - loss 0.02896451 - time (sec): 4.28 - samples/sec: 3419.38 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:43:47,486 epoch 6 - iter 52/136 - loss 0.02436989 - time (sec): 5.80 - samples/sec: 3419.02 - lr: 0.000026 - momentum: 0.000000
2023-10-17 20:43:48,962 epoch 6 - iter 65/136 - loss 0.02415456 - time (sec): 7.27 - samples/sec: 3440.57 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:43:50,588 epoch 6 - iter 78/136 - loss 0.02331112 - time (sec): 8.90 - samples/sec: 3451.15 - lr: 0.000025 - momentum: 0.000000
2023-10-17 20:43:51,950 epoch 6 - iter 91/136 - loss 0.02208564 - time (sec): 10.26 - samples/sec: 3438.57 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:43:53,304 epoch 6 - iter 104/136 - loss 0.02410552 - time (sec): 11.62 - samples/sec: 3447.65 - lr: 0.000024 - momentum: 0.000000
2023-10-17 20:43:54,810 epoch 6 - iter 117/136 - loss 0.02282725 - time (sec): 13.12 - samples/sec: 3473.11 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:43:56,039 epoch 6 - iter 130/136 - loss 0.02215833 - time (sec): 14.35 - samples/sec: 3479.14 - lr: 0.000023 - momentum: 0.000000
2023-10-17 20:43:56,545 ----------------------------------------------------------------------------------------------------
2023-10-17 20:43:56,545 EPOCH 6 done: loss 0.0227 - lr: 0.000023
2023-10-17 20:43:58,074 DEV : loss 0.13876760005950928 - f1-score (micro avg) 0.7971
2023-10-17 20:43:58,080 ----------------------------------------------------------------------------------------------------
2023-10-17 20:43:59,378 epoch 7 - iter 13/136 - loss 0.00755380 - time (sec): 1.30 - samples/sec: 3785.36 - lr: 0.000022 - momentum: 0.000000
2023-10-17 20:44:00,973 epoch 7 - iter 26/136 - loss 0.00992484 - time (sec): 2.89 - samples/sec: 3820.31 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:44:02,332 epoch 7 - iter 39/136 - loss 0.01261973 - time (sec): 4.25 - samples/sec: 3610.47 - lr: 0.000021 - momentum: 0.000000
2023-10-17 20:44:03,564 epoch 7 - iter 52/136 - loss 0.01262043 - time (sec): 5.48 - samples/sec: 3560.56 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:44:04,979 epoch 7 - iter 65/136 - loss 0.01417440 - time (sec): 6.90 - samples/sec: 3630.20 - lr: 0.000020 - momentum: 0.000000
2023-10-17 20:44:06,376 epoch 7 - iter 78/136 - loss 0.01699668 - time (sec): 8.29 - samples/sec: 3692.79 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:44:07,667 epoch 7 - iter 91/136 - loss 0.01622148 - time (sec): 9.59 - samples/sec: 3717.77 - lr: 0.000019 - momentum: 0.000000
2023-10-17 20:44:09,054 epoch 7 - iter 104/136 - loss 0.01831688 - time (sec): 10.97 - samples/sec: 3672.06 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:44:10,463 epoch 7 - iter 117/136 - loss 0.01805275 - time (sec): 12.38 - samples/sec: 3660.48 - lr: 0.000018 - momentum: 0.000000
2023-10-17 20:44:12,008 epoch 7 - iter 130/136 - loss 0.01673109 - time (sec): 13.93 - samples/sec: 3620.85 - lr: 0.000017 - momentum: 0.000000
2023-10-17 20:44:12,549 ----------------------------------------------------------------------------------------------------
2023-10-17 20:44:12,549 EPOCH 7 done: loss 0.0172 - lr: 0.000017
2023-10-17 20:44:14,050 DEV : loss 0.135068878531456 - f1-score (micro avg) 0.8007
2023-10-17 20:44:14,055 ----------------------------------------------------------------------------------------------------
2023-10-17 20:44:15,400 epoch 8 - iter 13/136 - loss 0.02534486 - time (sec): 1.34 - samples/sec: 3544.29 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:44:17,001 epoch 8 - iter 26/136 - loss 0.01371569 - time (sec): 2.94 - samples/sec: 3276.99 - lr: 0.000016 - momentum: 0.000000
2023-10-17 20:44:18,363 epoch 8 - iter 39/136 - loss 0.01519439 - time (sec): 4.31 - samples/sec: 3316.06 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:44:19,814 epoch 8 - iter 52/136 - loss 0.01421570 - time (sec): 5.76 - samples/sec: 3334.48 - lr: 0.000015 - momentum: 0.000000
2023-10-17 20:44:21,129 epoch 8 - iter 65/136 - loss 0.01455442 - time (sec): 7.07 - samples/sec: 3396.61 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:44:22,934 epoch 8 - iter 78/136 - loss 0.01281225 - time (sec): 8.88 - samples/sec: 3406.72 - lr: 0.000014 - momentum: 0.000000
2023-10-17 20:44:24,198 epoch 8 - iter 91/136 - loss 0.01236698 - time (sec): 10.14 - samples/sec: 3450.95 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:44:25,545 epoch 8 - iter 104/136 - loss 0.01253968 - time (sec): 11.49 - samples/sec: 3507.74 - lr: 0.000013 - momentum: 0.000000
2023-10-17 20:44:26,849 epoch 8 - iter 117/136 - loss 0.01184043 - time (sec): 12.79 - samples/sec: 3494.93 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:44:28,189 epoch 8 - iter 130/136 - loss 0.01190958 - time (sec): 14.13 - samples/sec: 3542.44 - lr: 0.000012 - momentum: 0.000000
2023-10-17 20:44:28,676 ----------------------------------------------------------------------------------------------------
2023-10-17 20:44:28,676 EPOCH 8 done: loss 0.0115 - lr: 0.000012
2023-10-17 20:44:30,188 DEV : loss 0.15049846470355988 - f1-score (micro avg) 0.8125
2023-10-17 20:44:30,196 saving best model
2023-10-17 20:44:30,754 ----------------------------------------------------------------------------------------------------
2023-10-17 20:44:32,109 epoch 9 - iter 13/136 - loss 0.00518564 - time (sec): 1.35 - samples/sec: 3834.24 - lr: 0.000011 - momentum: 0.000000
2023-10-17 20:44:33,708 epoch 9 - iter 26/136 - loss 0.01160391 - time (sec): 2.95 - samples/sec: 3762.86 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:44:35,210 epoch 9 - iter 39/136 - loss 0.00814467 - time (sec): 4.45 - samples/sec: 3583.25 - lr: 0.000010 - momentum: 0.000000
2023-10-17 20:44:36,459 epoch 9 - iter 52/136 - loss 0.00815978 - time (sec): 5.70 - samples/sec: 3538.25 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:44:37,693 epoch 9 - iter 65/136 - loss 0.01000613 - time (sec): 6.94 - samples/sec: 3602.20 - lr: 0.000009 - momentum: 0.000000
2023-10-17 20:44:39,091 epoch 9 - iter 78/136 - loss 0.00866915 - time (sec): 8.33 - samples/sec: 3638.27 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:44:40,432 epoch 9 - iter 91/136 - loss 0.00783014 - time (sec): 9.68 - samples/sec: 3658.37 - lr: 0.000008 - momentum: 0.000000
2023-10-17 20:44:41,715 epoch 9 - iter 104/136 - loss 0.00795667 - time (sec): 10.96 - samples/sec: 3659.40 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:44:43,025 epoch 9 - iter 117/136 - loss 0.00776984 - time (sec): 12.27 - samples/sec: 3689.62 - lr: 0.000007 - momentum: 0.000000
2023-10-17 20:44:44,402 epoch 9 - iter 130/136 - loss 0.00793713 - time (sec): 13.65 - samples/sec: 3682.03 - lr: 0.000006 - momentum: 0.000000
2023-10-17 20:44:44,873 ----------------------------------------------------------------------------------------------------
2023-10-17 20:44:44,873 EPOCH 9 done: loss 0.0083 - lr: 0.000006
2023-10-17 20:44:46,356 DEV : loss 0.1609845757484436 - f1-score (micro avg) 0.8044
2023-10-17 20:44:46,362 ----------------------------------------------------------------------------------------------------
2023-10-17 20:44:47,715 epoch 10 - iter 13/136 - loss 0.00546006 - time (sec): 1.35 - samples/sec: 3993.02 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:44:49,007 epoch 10 - iter 26/136 - loss 0.00322110 - time (sec): 2.64 - samples/sec: 3865.19 - lr: 0.000005 - momentum: 0.000000
2023-10-17 20:44:50,499 epoch 10 - iter 39/136 - loss 0.00273811 - time (sec): 4.14 - samples/sec: 3629.86 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:44:51,704 epoch 10 - iter 52/136 - loss 0.00447940 - time (sec): 5.34 - samples/sec: 3532.07 - lr: 0.000004 - momentum: 0.000000
2023-10-17 20:44:53,162 epoch 10 - iter 65/136 - loss 0.00412880 - time (sec): 6.80 - samples/sec: 3551.89 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:44:54,559 epoch 10 - iter 78/136 - loss 0.00483931 - time (sec): 8.20 - samples/sec: 3554.87 - lr: 0.000003 - momentum: 0.000000
2023-10-17 20:44:55,825 epoch 10 - iter 91/136 - loss 0.00597590 - time (sec): 9.46 - samples/sec: 3563.30 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:44:57,492 epoch 10 - iter 104/136 - loss 0.00684780 - time (sec): 11.13 - samples/sec: 3552.53 - lr: 0.000002 - momentum: 0.000000
2023-10-17 20:44:58,987 epoch 10 - iter 117/136 - loss 0.00648974 - time (sec): 12.62 - samples/sec: 3553.46 - lr: 0.000001 - momentum: 0.000000
2023-10-17 20:45:00,225 epoch 10 - iter 130/136 - loss 0.00586892 - time (sec): 13.86 - samples/sec: 3593.40 - lr: 0.000000 - momentum: 0.000000
2023-10-17 20:45:00,771 ----------------------------------------------------------------------------------------------------
2023-10-17 20:45:00,771 EPOCH 10 done: loss 0.0060 - lr: 0.000000
2023-10-17 20:45:02,256 DEV : loss 0.1766374260187149 - f1-score (micro avg) 0.8
2023-10-17 20:45:02,612 ----------------------------------------------------------------------------------------------------
2023-10-17 20:45:02,613 Loading model from best epoch ...
2023-10-17 20:45:04,070 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 20:45:06,110 Results:
- F-score (micro) 0.8003
- F-score (macro) 0.7578
- Accuracy 0.6851

By class:
              precision    recall  f1-score   support

         LOC     0.8081    0.8910    0.8476       312
         PER     0.7126    0.8702    0.7835       208
         ORG     0.5957    0.5091    0.5490        55
   HumanProd     0.8000    0.9091    0.8511        22

   micro avg     0.7567    0.8492    0.8003       597
   macro avg     0.7291    0.7948    0.7578       597
weighted avg     0.7550    0.8492    0.7979       597

2023-10-17 20:45:06,110 ----------------------------------------------------------------------------------------------------
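
The summary scores in the final evaluation can be cross-checked from the per-class rows of the table: macro F1 is the unweighted mean of the class F1 scores, weighted F1 weights each class by its support, and micro F1 is the harmonic mean of the micro-averaged precision and recall. A quick sanity check in plain Python, using only the numbers printed in the log above:

```python
# Per-class (precision, recall, f1, support) as reported in the log.
by_class = {
    "LOC":       (0.8081, 0.8910, 0.8476, 312),
    "PER":       (0.7126, 0.8702, 0.7835, 208),
    "ORG":       (0.5957, 0.5091, 0.5490, 55),
    "HumanProd": (0.8000, 0.9091, 0.8511, 22),
}

# Macro F1: unweighted mean of per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in by_class.values()) / len(by_class)

# Weighted F1: per-class F1 weighted by class support.
total_support = sum(s for _, _, _, s in by_class.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in by_class.values()) / total_support

# Micro F1: harmonic mean of the micro-averaged precision and recall.
micro_p, micro_r = 0.7567, 0.8492
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(round(macro_f1, 4))     # 0.7578, matching the reported macro F-score
print(round(weighted_f1, 4))  # 0.7979, matching the weighted avg row
print(round(micro_f1, 4))     # 0.8003, matching the reported micro F-score
```

All three recomputed values agree with the log, which is a useful consistency check when transcribing results from training runs.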
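
The LinearScheduler plugin with warmup_fraction '0.1' corresponds to a linear warmup/decay schedule over the 1360 total steps (136 iterations × 10 epochs): the learning rate climbs toward the peak of 5e-05 during the first 10% of steps (roughly epoch 1, where the lr column rises from 0.000004 to 0.000047) and then decays linearly to 0 by the last step. A minimal sketch of that shape (our own function, not Flair's implementation; exact per-step values in the log may differ slightly due to rounding and step-offset conventions):

```python
def linear_warmup_lr(step, total_steps=1360, warmup_fraction=0.1, peak_lr=5e-05):
    """Linear warmup to peak_lr, then linear decay to 0.

    A sketch of the schedule implied by the log's lr column, not Flair's code.
    """
    warmup_steps = int(total_steps * warmup_fraction)  # 136 steps ~ epoch 1
    if step < warmup_steps:
        return peak_lr * step / warmup_steps           # warmup phase
    # decay phase: linearly down to 0 at the final step
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

print(linear_warmup_lr(136))   # peak lr at the end of warmup
print(linear_warmup_lr(1360))  # 0.0 at the final step, as in epoch 10
```

This explains why momentum stays 0.000000 throughout: the run uses AdamW-style fine-tuning with a scheduled learning rate rather than SGD with momentum.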