2023-10-13 09:38:45,457 ----------------------------------------------------------------------------------------------------
2023-10-13 09:38:45,458 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 09:38:45,458 ----------------------------------------------------------------------------------------------------
2023-10-13 09:38:45,458 MultiCorpus: 1214 train + 266 dev + 251 test sentences
 - NER_HIPE_2022 Corpus: 1214 train + 266 dev + 251 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/en/with_doc_seperator
2023-10-13 09:38:45,458 ----------------------------------------------------------------------------------------------------
2023-10-13 09:38:45,458 Train: 1214 sentences
2023-10-13 09:38:45,458 (train_with_dev=False, train_with_test=False)
2023-10-13 09:38:45,458 ----------------------------------------------------------------------------------------------------
2023-10-13 09:38:45,458 Training Params:
2023-10-13 09:38:45,458 - learning_rate: "3e-05"
2023-10-13 09:38:45,458 - mini_batch_size: "8"
2023-10-13 09:38:45,458 - max_epochs: "10"
2023-10-13 09:38:45,459 - shuffle: "True"
2023-10-13 09:38:45,459 ----------------------------------------------------------------------------------------------------
2023-10-13 09:38:45,459 Plugins:
2023-10-13 09:38:45,459 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 09:38:45,459 ----------------------------------------------------------------------------------------------------
2023-10-13 09:38:45,459 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 09:38:45,459 - metric: "('micro avg', 'f1-score')"
2023-10-13 09:38:45,459 ----------------------------------------------------------------------------------------------------
2023-10-13 09:38:45,459 Computation:
2023-10-13 09:38:45,459 - compute on device: cuda:0
2023-10-13 09:38:45,459 - embedding storage: none
2023-10-13 09:38:45,459 ----------------------------------------------------------------------------------------------------
2023-10-13 09:38:45,459 Model training base path: "hmbench-ajmc/en-dbmdz/bert-base-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4"
2023-10-13 09:38:45,459 ----------------------------------------------------------------------------------------------------
2023-10-13 09:38:45,459 ----------------------------------------------------------------------------------------------------
2023-10-13 09:38:46,317 epoch 1 - iter 15/152 - loss 3.42039858 - time (sec): 0.86 - samples/sec: 3404.00 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:38:47,165 epoch 1 - iter 30/152 - loss 3.16945633 - time (sec): 1.70 - samples/sec: 3585.05 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:38:48,014 epoch 1 - iter 45/152 - loss 2.66696432 - time (sec): 2.55 - samples/sec: 3591.76 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:38:48,844 epoch 1 - iter 60/152 - loss 2.17975677 - time (sec): 3.38 - samples/sec: 3617.27 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:38:49,617 epoch 1 - iter 75/152 - loss 1.90302387 - time (sec): 4.16 - samples/sec: 3608.59 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:38:50,472 epoch 1 - iter 90/152 - loss 1.68392529 - time (sec): 5.01 - samples/sec: 3617.13 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:38:51,320 epoch 1 - iter 105/152 - loss 1.50548696 - time (sec): 5.86 - samples/sec: 3663.47 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:38:52,163 epoch 1 - iter 120/152 - loss 1.36344230 - time (sec): 6.70 - samples/sec: 3633.81 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:38:53,021 epoch 1 - iter 135/152 - loss 1.24394055 - time (sec): 7.56 - samples/sec: 3623.86 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:38:53,913 epoch 1 - iter 150/152 - loss 1.14577441 - time (sec): 8.45 - samples/sec: 3619.01 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:38:54,019 ----------------------------------------------------------------------------------------------------
2023-10-13 09:38:54,019 EPOCH 1 done: loss 1.1329 - lr: 0.000029
2023-10-13 09:38:54,947 DEV : loss 0.2724057137966156 - f1-score (micro avg) 0.5102
2023-10-13 09:38:54,953 saving best model
2023-10-13 09:38:55,334 ----------------------------------------------------------------------------------------------------
2023-10-13 09:38:56,202 epoch 2 - iter 15/152 - loss 0.26672296 - time (sec): 0.87 - samples/sec: 3551.23 - lr: 0.000030 - momentum: 0.000000
2023-10-13 09:38:57,048 epoch 2 - iter 30/152 - loss 0.24386089 - time (sec): 1.71 - samples/sec: 3603.80 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:38:57,954 epoch 2 - iter 45/152 - loss 0.23649634 - time (sec): 2.62 - samples/sec: 3467.77 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:38:58,785 epoch 2 - iter 60/152 - loss 0.22440225 - time (sec): 3.45 - samples/sec: 3509.89 - lr: 0.000029 - momentum: 0.000000
2023-10-13 09:38:59,627 epoch 2 - iter 75/152 - loss 0.20433870 - time (sec): 4.29 - samples/sec: 3531.29 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:39:00,468 epoch 2 - iter 90/152 - loss 0.19792887 - time (sec): 5.13 - samples/sec: 3565.67 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:39:01,305 epoch 2 - iter 105/152 - loss 0.19571936 - time (sec): 5.97 - samples/sec: 3609.01 - lr: 0.000028 - momentum: 0.000000
2023-10-13 09:39:02,119 epoch 2 - iter 120/152 - loss 0.19015630 - time (sec): 6.78 - samples/sec: 3599.80 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:39:02,985 epoch 2 - iter 135/152 - loss 0.17820321 - time (sec): 7.65 - samples/sec: 3610.79 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:39:03,844 epoch 2 - iter 150/152 - loss 0.17480304 - time (sec): 8.51 - samples/sec: 3610.91 - lr: 0.000027 - momentum: 0.000000
2023-10-13 09:39:03,952 ----------------------------------------------------------------------------------------------------
2023-10-13 09:39:03,952 EPOCH 2 done: loss 0.1742 - lr: 0.000027
2023-10-13 09:39:04,944 DEV : loss 0.14681921899318695 - f1-score (micro avg) 0.7847
2023-10-13 09:39:04,951 saving best model
2023-10-13 09:39:05,414 ----------------------------------------------------------------------------------------------------
2023-10-13 09:39:06,290 epoch 3 - iter 15/152 - loss 0.08309311 - time (sec): 0.87 - samples/sec: 3654.07 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:39:07,190 epoch 3 - iter 30/152 - loss 0.08445615 - time (sec): 1.77 - samples/sec: 3581.28 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:39:08,100 epoch 3 - iter 45/152 - loss 0.08356708 - time (sec): 2.68 - samples/sec: 3437.01 - lr: 0.000026 - momentum: 0.000000
2023-10-13 09:39:09,006 epoch 3 - iter 60/152 - loss 0.08378525 - time (sec): 3.59 - samples/sec: 3378.19 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:39:09,891 epoch 3 - iter 75/152 - loss 0.09290886 - time (sec): 4.47 - samples/sec: 3375.02 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:39:10,833 epoch 3 - iter 90/152 - loss 0.09109264 - time (sec): 5.42 - samples/sec: 3374.63 - lr: 0.000025 - momentum: 0.000000
2023-10-13 09:39:11,706 epoch 3 - iter 105/152 - loss 0.09013940 - time (sec): 6.29 - samples/sec: 3433.31 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:39:12,580 epoch 3 - iter 120/152 - loss 0.08581588 - time (sec): 7.16 - samples/sec: 3402.88 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:39:13,427 epoch 3 - iter 135/152 - loss 0.09150916 - time (sec): 8.01 - samples/sec: 3446.20 - lr: 0.000024 - momentum: 0.000000
2023-10-13 09:39:14,322 epoch 3 - iter 150/152 - loss 0.09099483 - time (sec): 8.91 - samples/sec: 3448.19 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:39:14,418 ----------------------------------------------------------------------------------------------------
2023-10-13 09:39:14,418 EPOCH 3 done: loss 0.0905 - lr: 0.000023
2023-10-13 09:39:15,380 DEV : loss 0.13434813916683197 - f1-score (micro avg) 0.8265
2023-10-13 09:39:15,386 saving best model
2023-10-13 09:39:15,902 ----------------------------------------------------------------------------------------------------
2023-10-13 09:39:16,735 epoch 4 - iter 15/152 - loss 0.08829873 - time (sec): 0.82 - samples/sec: 4020.27 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:39:17,568 epoch 4 - iter 30/152 - loss 0.08313918 - time (sec): 1.66 - samples/sec: 3725.43 - lr: 0.000023 - momentum: 0.000000
2023-10-13 09:39:18,464 epoch 4 - iter 45/152 - loss 0.08554648 - time (sec): 2.55 - samples/sec: 3665.02 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:39:19,288 epoch 4 - iter 60/152 - loss 0.08081654 - time (sec): 3.38 - samples/sec: 3659.32 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:39:20,096 epoch 4 - iter 75/152 - loss 0.07351109 - time (sec): 4.18 - samples/sec: 3664.35 - lr: 0.000022 - momentum: 0.000000
2023-10-13 09:39:20,935 epoch 4 - iter 90/152 - loss 0.06888951 - time (sec): 5.02 - samples/sec: 3703.82 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:39:21,772 epoch 4 - iter 105/152 - loss 0.06755660 - time (sec): 5.86 - samples/sec: 3664.70 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:39:22,628 epoch 4 - iter 120/152 - loss 0.06779828 - time (sec): 6.72 - samples/sec: 3651.94 - lr: 0.000021 - momentum: 0.000000
2023-10-13 09:39:23,449 epoch 4 - iter 135/152 - loss 0.06542532 - time (sec): 7.54 - samples/sec: 3656.96 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:39:24,301 epoch 4 - iter 150/152 - loss 0.06327155 - time (sec): 8.39 - samples/sec: 3647.39 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:39:24,420 ----------------------------------------------------------------------------------------------------
2023-10-13 09:39:24,420 EPOCH 4 done: loss 0.0626 - lr: 0.000020
2023-10-13 09:39:25,359 DEV : loss 0.14982837438583374 - f1-score (micro avg) 0.8467
2023-10-13 09:39:25,365 saving best model
2023-10-13 09:39:25,900 ----------------------------------------------------------------------------------------------------
2023-10-13 09:39:26,831 epoch 5 - iter 15/152 - loss 0.05015278 - time (sec): 0.93 - samples/sec: 3268.25 - lr: 0.000020 - momentum: 0.000000
2023-10-13 09:39:27,659 epoch 5 - iter 30/152 - loss 0.04675951 - time (sec): 1.76 - samples/sec: 3506.32 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:39:28,495 epoch 5 - iter 45/152 - loss 0.04678507 - time (sec): 2.59 - samples/sec: 3551.35 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:39:29,296 epoch 5 - iter 60/152 - loss 0.04223145 - time (sec): 3.39 - samples/sec: 3581.63 - lr: 0.000019 - momentum: 0.000000
2023-10-13 09:39:30,132 epoch 5 - iter 75/152 - loss 0.03860938 - time (sec): 4.23 - samples/sec: 3581.73 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:39:30,969 epoch 5 - iter 90/152 - loss 0.04289756 - time (sec): 5.07 - samples/sec: 3607.66 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:39:31,797 epoch 5 - iter 105/152 - loss 0.04696796 - time (sec): 5.89 - samples/sec: 3610.86 - lr: 0.000018 - momentum: 0.000000
2023-10-13 09:39:32,628 epoch 5 - iter 120/152 - loss 0.04663757 - time (sec): 6.73 - samples/sec: 3616.98 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:39:33,497 epoch 5 - iter 135/152 - loss 0.04782612 - time (sec): 7.59 - samples/sec: 3632.63 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:39:34,324 epoch 5 - iter 150/152 - loss 0.04654518 - time (sec): 8.42 - samples/sec: 3639.00 - lr: 0.000017 - momentum: 0.000000
2023-10-13 09:39:34,445 ----------------------------------------------------------------------------------------------------
2023-10-13 09:39:34,445 EPOCH 5 done: loss 0.0460 - lr: 0.000017
2023-10-13 09:39:35,419 DEV : loss 0.16173015534877777 - f1-score (micro avg) 0.8517
2023-10-13 09:39:35,425 saving best model
2023-10-13 09:39:35,958 ----------------------------------------------------------------------------------------------------
2023-10-13 09:39:36,782 epoch 6 - iter 15/152 - loss 0.05787612 - time (sec): 0.82 - samples/sec: 3578.82 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:39:37,600 epoch 6 - iter 30/152 - loss 0.04226903 - time (sec): 1.64 - samples/sec: 3647.93 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:39:38,461 epoch 6 - iter 45/152 - loss 0.03419371 - time (sec): 2.50 - samples/sec: 3565.26 - lr: 0.000016 - momentum: 0.000000
2023-10-13 09:39:39,340 epoch 6 - iter 60/152 - loss 0.03279773 - time (sec): 3.38 - samples/sec: 3543.86 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:39:40,197 epoch 6 - iter 75/152 - loss 0.03556143 - time (sec): 4.24 - samples/sec: 3577.46 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:39:41,058 epoch 6 - iter 90/152 - loss 0.03362761 - time (sec): 5.10 - samples/sec: 3569.62 - lr: 0.000015 - momentum: 0.000000
2023-10-13 09:39:41,915 epoch 6 - iter 105/152 - loss 0.03321623 - time (sec): 5.96 - samples/sec: 3604.36 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:39:42,783 epoch 6 - iter 120/152 - loss 0.03458796 - time (sec): 6.82 - samples/sec: 3581.45 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:39:43,679 epoch 6 - iter 135/152 - loss 0.03512774 - time (sec): 7.72 - samples/sec: 3567.76 - lr: 0.000014 - momentum: 0.000000
2023-10-13 09:39:44,617 epoch 6 - iter 150/152 - loss 0.03640818 - time (sec): 8.66 - samples/sec: 3550.36 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:39:44,737 ----------------------------------------------------------------------------------------------------
2023-10-13 09:39:44,737 EPOCH 6 done: loss 0.0362 - lr: 0.000013
2023-10-13 09:39:45,738 DEV : loss 0.17967215180397034 - f1-score (micro avg) 0.8357
2023-10-13 09:39:45,747 ----------------------------------------------------------------------------------------------------
2023-10-13 09:39:46,642 epoch 7 - iter 15/152 - loss 0.02708283 - time (sec): 0.89 - samples/sec: 3370.29 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:39:47,591 epoch 7 - iter 30/152 - loss 0.01696927 - time (sec): 1.84 - samples/sec: 3295.33 - lr: 0.000013 - momentum: 0.000000
2023-10-13 09:39:48,515 epoch 7 - iter 45/152 - loss 0.01916225 - time (sec): 2.77 - samples/sec: 3249.69 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:39:49,428 epoch 7 - iter 60/152 - loss 0.02226671 - time (sec): 3.68 - samples/sec: 3260.09 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:39:50,360 epoch 7 - iter 75/152 - loss 0.02113811 - time (sec): 4.61 - samples/sec: 3294.64 - lr: 0.000012 - momentum: 0.000000
2023-10-13 09:39:51,290 epoch 7 - iter 90/152 - loss 0.02114491 - time (sec): 5.54 - samples/sec: 3301.77 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:39:52,210 epoch 7 - iter 105/152 - loss 0.02269103 - time (sec): 6.46 - samples/sec: 3337.71 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:39:53,087 epoch 7 - iter 120/152 - loss 0.02681876 - time (sec): 7.34 - samples/sec: 3353.87 - lr: 0.000011 - momentum: 0.000000
2023-10-13 09:39:53,940 epoch 7 - iter 135/152 - loss 0.02578900 - time (sec): 8.19 - samples/sec: 3373.66 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:39:54,763 epoch 7 - iter 150/152 - loss 0.02692004 - time (sec): 9.01 - samples/sec: 3393.20 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:39:54,872 ----------------------------------------------------------------------------------------------------
2023-10-13 09:39:54,872 EPOCH 7 done: loss 0.0279 - lr: 0.000010
2023-10-13 09:39:55,846 DEV : loss 0.17587348818778992 - f1-score (micro avg) 0.8578
2023-10-13 09:39:55,856 saving best model
2023-10-13 09:39:56,414 ----------------------------------------------------------------------------------------------------
2023-10-13 09:39:57,308 epoch 8 - iter 15/152 - loss 0.00929224 - time (sec): 0.89 - samples/sec: 3390.65 - lr: 0.000010 - momentum: 0.000000
2023-10-13 09:39:58,199 epoch 8 - iter 30/152 - loss 0.01942479 - time (sec): 1.78 - samples/sec: 3496.14 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:39:59,113 epoch 8 - iter 45/152 - loss 0.02860277 - time (sec): 2.70 - samples/sec: 3418.84 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:40:00,029 epoch 8 - iter 60/152 - loss 0.02366104 - time (sec): 3.61 - samples/sec: 3382.91 - lr: 0.000009 - momentum: 0.000000
2023-10-13 09:40:01,282 epoch 8 - iter 75/152 - loss 0.02506110 - time (sec): 4.87 - samples/sec: 3182.13 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:40:02,159 epoch 8 - iter 90/152 - loss 0.02244323 - time (sec): 5.74 - samples/sec: 3219.10 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:40:03,087 epoch 8 - iter 105/152 - loss 0.01991333 - time (sec): 6.67 - samples/sec: 3198.93 - lr: 0.000008 - momentum: 0.000000
2023-10-13 09:40:03,961 epoch 8 - iter 120/152 - loss 0.01927426 - time (sec): 7.55 - samples/sec: 3222.51 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:40:04,849 epoch 8 - iter 135/152 - loss 0.02106969 - time (sec): 8.43 - samples/sec: 3245.87 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:40:05,727 epoch 8 - iter 150/152 - loss 0.02042280 - time (sec): 9.31 - samples/sec: 3285.49 - lr: 0.000007 - momentum: 0.000000
2023-10-13 09:40:05,838 ----------------------------------------------------------------------------------------------------
2023-10-13 09:40:05,838 EPOCH 8 done: loss 0.0208 - lr: 0.000007
2023-10-13 09:40:06,788 DEV : loss 0.1839696615934372 - f1-score (micro avg) 0.8314
2023-10-13 09:40:06,795 ----------------------------------------------------------------------------------------------------
2023-10-13 09:40:07,588 epoch 9 - iter 15/152 - loss 0.01914733 - time (sec): 0.79 - samples/sec: 3519.45 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:40:08,428 epoch 9 - iter 30/152 - loss 0.01898946 - time (sec): 1.63 - samples/sec: 3581.06 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:40:09,308 epoch 9 - iter 45/152 - loss 0.01957067 - time (sec): 2.51 - samples/sec: 3656.80 - lr: 0.000006 - momentum: 0.000000
2023-10-13 09:40:10,079 epoch 9 - iter 60/152 - loss 0.02357565 - time (sec): 3.28 - samples/sec: 3613.09 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:40:10,909 epoch 9 - iter 75/152 - loss 0.02351370 - time (sec): 4.11 - samples/sec: 3597.67 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:40:11,781 epoch 9 - iter 90/152 - loss 0.02277609 - time (sec): 4.98 - samples/sec: 3631.28 - lr: 0.000005 - momentum: 0.000000
2023-10-13 09:40:12,645 epoch 9 - iter 105/152 - loss 0.02045437 - time (sec): 5.85 - samples/sec: 3684.14 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:40:13,459 epoch 9 - iter 120/152 - loss 0.01842170 - time (sec): 6.66 - samples/sec: 3667.39 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:40:14,276 epoch 9 - iter 135/152 - loss 0.01671679 - time (sec): 7.48 - samples/sec: 3679.89 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:40:15,167 epoch 9 - iter 150/152 - loss 0.01739569 - time (sec): 8.37 - samples/sec: 3660.04 - lr: 0.000004 - momentum: 0.000000
2023-10-13 09:40:15,270 ----------------------------------------------------------------------------------------------------
2023-10-13 09:40:15,270 EPOCH 9 done: loss 0.0172 - lr: 0.000004
2023-10-13 09:40:16,265 DEV : loss 0.1881193369626999 - f1-score (micro avg) 0.8551
2023-10-13 09:40:16,272 ----------------------------------------------------------------------------------------------------
2023-10-13 09:40:17,186 epoch 10 - iter 15/152 - loss 0.00395892 - time (sec): 0.91 - samples/sec: 3317.23 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:40:18,087 epoch 10 - iter 30/152 - loss 0.00446924 - time (sec): 1.81 - samples/sec: 3255.88 - lr: 0.000003 - momentum: 0.000000
2023-10-13 09:40:18,942 epoch 10 - iter 45/152 - loss 0.00768769 - time (sec): 2.67 - samples/sec: 3275.31 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:40:19,876 epoch 10 - iter 60/152 - loss 0.00679437 - time (sec): 3.60 - samples/sec: 3294.41 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:40:20,763 epoch 10 - iter 75/152 - loss 0.01237748 - time (sec): 4.49 - samples/sec: 3349.85 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:40:21,713 epoch 10 - iter 90/152 - loss 0.01043566 - time (sec): 5.44 - samples/sec: 3372.04 - lr: 0.000002 - momentum: 0.000000
2023-10-13 09:40:22,637 epoch 10 - iter 105/152 - loss 0.01088903 - time (sec): 6.36 - samples/sec: 3355.91 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:40:23,545 epoch 10 - iter 120/152 - loss 0.01417993 - time (sec): 7.27 - samples/sec: 3350.91 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:40:24,459 epoch 10 - iter 135/152 - loss 0.01419198 - time (sec): 8.19 - samples/sec: 3331.40 - lr: 0.000001 - momentum: 0.000000
2023-10-13 09:40:25,435 epoch 10 - iter 150/152 - loss 0.01387484 - time (sec): 9.16 - samples/sec: 3344.38 - lr: 0.000000 - momentum: 0.000000
2023-10-13 09:40:25,550 ----------------------------------------------------------------------------------------------------
2023-10-13 09:40:25,550 EPOCH 10 done: loss 0.0137 - lr: 0.000000
2023-10-13 09:40:26,517 DEV : loss 0.1888277232646942 - f1-score (micro avg) 0.8571
2023-10-13 09:40:26,973 ----------------------------------------------------------------------------------------------------
2023-10-13 09:40:26,975 Loading model from best epoch ...
2023-10-13 09:40:28,555 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-date, B-date, E-date, I-date, S-object, B-object, E-object, I-object
2023-10-13 09:40:29,750
Results:
- F-score (micro) 0.7839
- F-score (macro) 0.6335
- Accuracy 0.6503

By class:
              precision    recall  f1-score   support

       scope     0.7547    0.7947    0.7742       151
        work     0.6860    0.8737    0.7685        95
        pers     0.7565    0.9062    0.8246        96
         loc     1.0000    0.6667    0.8000         3
        date     0.0000    0.0000    0.0000         3

   micro avg     0.7355    0.8391    0.7839       348
   macro avg     0.6394    0.6483    0.6335       348
weighted avg     0.7321    0.8391    0.7801       348

2023-10-13 09:40:29,750 ----------------------------------------------------------------------------------------------------