2023-10-27 17:20:54,808 ----------------------------------------------------------------------------------------------------
2023-10-27 17:20:54,809 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): XLMRobertaModel(
      (embeddings): XLMRobertaEmbeddings(
        (word_embeddings): Embedding(250003, 1024)
        (position_embeddings): Embedding(514, 1024, padding_idx=1)
        (token_type_embeddings): Embedding(1, 1024)
        (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): XLMRobertaEncoder(
        (layer): ModuleList(
          (0-23): 24 x XLMRobertaLayer(
            (attention): XLMRobertaAttention(
              (self): XLMRobertaSelfAttention(
                (query): Linear(in_features=1024, out_features=1024, bias=True)
                (key): Linear(in_features=1024, out_features=1024, bias=True)
                (value): Linear(in_features=1024, out_features=1024, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): XLMRobertaSelfOutput(
                (dense): Linear(in_features=1024, out_features=1024, bias=True)
                (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): XLMRobertaIntermediate(
              (dense): Linear(in_features=1024, out_features=4096, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): XLMRobertaOutput(
              (dense): Linear(in_features=4096, out_features=1024, bias=True)
              (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): XLMRobertaPooler(
        (dense): Linear(in_features=1024, out_features=1024, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1024, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-27 17:20:54,809 ----------------------------------------------------------------------------------------------------
2023-10-27 17:20:54,809 Corpus: 14903 train + 3449 dev + 3658 test sentences
2023-10-27 17:20:54,809 ----------------------------------------------------------------------------------------------------
2023-10-27 17:20:54,809 Train: 14903 sentences
2023-10-27 17:20:54,809 (train_with_dev=False, train_with_test=False)
2023-10-27 17:20:54,809 ----------------------------------------------------------------------------------------------------
2023-10-27 17:20:54,809 Training Params:
2023-10-27 17:20:54,809 - learning_rate: "5e-06"
2023-10-27 17:20:54,809 - mini_batch_size: "4"
2023-10-27 17:20:54,809 - max_epochs: "10"
2023-10-27 17:20:54,809 - shuffle: "True"
2023-10-27 17:20:54,810 ----------------------------------------------------------------------------------------------------
2023-10-27 17:20:54,810 Plugins:
2023-10-27 17:20:54,810 - TensorboardLogger
2023-10-27 17:20:54,810 - LinearScheduler | warmup_fraction: '0.1'
2023-10-27 17:20:54,810 ----------------------------------------------------------------------------------------------------
2023-10-27 17:20:54,810 Final evaluation on model from best epoch (best-model.pt)
2023-10-27 17:20:54,810 - metric: "('micro avg', 'f1-score')"
2023-10-27 17:20:54,810 ----------------------------------------------------------------------------------------------------
2023-10-27 17:20:54,810 Computation:
2023-10-27 17:20:54,810 - compute on device: cuda:0
2023-10-27 17:20:54,810 - embedding storage: none
2023-10-27 17:20:54,810 ----------------------------------------------------------------------------------------------------
2023-10-27 17:20:54,810 Model training base path: "flair-clean-conll-lr5e-06-bs4-3"
2023-10-27 17:20:54,810 ----------------------------------------------------------------------------------------------------
2023-10-27 17:20:54,810 ----------------------------------------------------------------------------------------------------
2023-10-27 17:20:54,810 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-27 17:21:40,162 epoch 1 - iter 372/3726 - loss 2.98651987 - time (sec): 45.35 - samples/sec: 437.70 - lr: 0.000000 - momentum: 0.000000
2023-10-27 17:22:25,895 epoch 1 - iter 744/3726 - loss 1.95456152 - time (sec): 91.08 - samples/sec: 446.47 - lr: 0.000001 - momentum: 0.000000
2023-10-27 17:23:14,767 epoch 1 - iter 1116/3726 - loss 1.47436963 - time (sec): 139.96 - samples/sec: 438.22 - lr: 0.000001 - momentum: 0.000000
2023-10-27 17:24:00,286 epoch 1 - iter 1488/3726 - loss 1.21396490 - time (sec): 185.47 - samples/sec: 440.46 - lr: 0.000002 - momentum: 0.000000
2023-10-27 17:24:46,086 epoch 1 - iter 1860/3726 - loss 1.02707175 - time (sec): 231.27 - samples/sec: 442.97 - lr: 0.000002 - momentum: 0.000000
2023-10-27 17:25:31,827 epoch 1 - iter 2232/3726 - loss 0.89439809 - time (sec): 277.02 - samples/sec: 442.39 - lr: 0.000003 - momentum: 0.000000
2023-10-27 17:26:17,786 epoch 1 - iter 2604/3726 - loss 0.78674618 - time (sec): 322.97 - samples/sec: 443.93 - lr: 0.000003 - momentum: 0.000000
2023-10-27 17:27:04,447 epoch 1 - iter 2976/3726 - loss 0.70312379 - time (sec): 369.64 - samples/sec: 443.20 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:27:50,340 epoch 1 - iter 3348/3726 - loss 0.63740018 - time (sec): 415.53 - samples/sec: 442.57 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:28:36,141 epoch 1 - iter 3720/3726 - loss 0.58486957 - time (sec): 461.33 - samples/sec: 442.64 - lr: 0.000005 - momentum: 0.000000
2023-10-27 17:28:36,887 ----------------------------------------------------------------------------------------------------
2023-10-27 17:28:36,887 EPOCH 1 done: loss 0.5838 - lr: 0.000005
2023-10-27 17:28:59,836 DEV : loss 0.08319637179374695 - f1-score (micro avg) 0.9362
2023-10-27 17:28:59,887 saving best model
2023-10-27 17:29:02,186 ----------------------------------------------------------------------------------------------------
2023-10-27 17:29:48,303 epoch 2 - iter 372/3726 - loss 0.11238039 - time (sec): 46.11 - samples/sec: 434.33 - lr: 0.000005 - momentum: 0.000000
2023-10-27 17:30:34,246 epoch 2 - iter 744/3726 - loss 0.10205120 - time (sec): 92.06 - samples/sec: 440.16 - lr: 0.000005 - momentum: 0.000000
2023-10-27 17:31:20,589 epoch 2 - iter 1116/3726 - loss 0.09058251 - time (sec): 138.40 - samples/sec: 447.87 - lr: 0.000005 - momentum: 0.000000
2023-10-27 17:32:06,506 epoch 2 - iter 1488/3726 - loss 0.09156566 - time (sec): 184.32 - samples/sec: 443.90 - lr: 0.000005 - momentum: 0.000000
2023-10-27 17:32:52,277 epoch 2 - iter 1860/3726 - loss 0.08901480 - time (sec): 230.09 - samples/sec: 444.10 - lr: 0.000005 - momentum: 0.000000
2023-10-27 17:33:38,125 epoch 2 - iter 2232/3726 - loss 0.08767046 - time (sec): 275.94 - samples/sec: 440.49 - lr: 0.000005 - momentum: 0.000000
2023-10-27 17:34:23,826 epoch 2 - iter 2604/3726 - loss 0.08532145 - time (sec): 321.64 - samples/sec: 441.24 - lr: 0.000005 - momentum: 0.000000
2023-10-27 17:35:09,960 epoch 2 - iter 2976/3726 - loss 0.08526503 - time (sec): 367.77 - samples/sec: 442.87 - lr: 0.000005 - momentum: 0.000000
2023-10-27 17:35:56,636 epoch 2 - iter 3348/3726 - loss 0.08368925 - time (sec): 414.45 - samples/sec: 443.20 - lr: 0.000005 - momentum: 0.000000
2023-10-27 17:36:42,804 epoch 2 - iter 3720/3726 - loss 0.08396268 - time (sec): 460.62 - samples/sec: 443.41 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:36:43,519 ----------------------------------------------------------------------------------------------------
2023-10-27 17:36:43,519 EPOCH 2 done: loss 0.0838 - lr: 0.000004
2023-10-27 17:37:07,571 DEV : loss 0.06706252694129944 - f1-score (micro avg) 0.9574
2023-10-27 17:37:07,624 saving best model
2023-10-27 17:37:10,584 ----------------------------------------------------------------------------------------------------
2023-10-27 17:37:57,573 epoch 3 - iter 372/3726 - loss 0.04814695 - time (sec): 46.99 - samples/sec: 438.53 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:38:44,337 epoch 3 - iter 744/3726 - loss 0.04886821 - time (sec): 93.75 - samples/sec: 438.14 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:39:31,726 epoch 3 - iter 1116/3726 - loss 0.05014060 - time (sec): 141.14 - samples/sec: 435.62 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:40:18,792 epoch 3 - iter 1488/3726 - loss 0.05220008 - time (sec): 188.21 - samples/sec: 437.54 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:41:05,149 epoch 3 - iter 1860/3726 - loss 0.05148240 - time (sec): 234.56 - samples/sec: 437.23 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:41:52,025 epoch 3 - iter 2232/3726 - loss 0.05339505 - time (sec): 281.44 - samples/sec: 437.28 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:42:38,711 epoch 3 - iter 2604/3726 - loss 0.05374593 - time (sec): 328.12 - samples/sec: 438.62 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:43:25,619 epoch 3 - iter 2976/3726 - loss 0.05287703 - time (sec): 375.03 - samples/sec: 437.97 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:44:13,432 epoch 3 - iter 3348/3726 - loss 0.05256041 - time (sec): 422.85 - samples/sec: 435.76 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:45:00,348 epoch 3 - iter 3720/3726 - loss 0.05257701 - time (sec): 469.76 - samples/sec: 434.99 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:45:01,081 ----------------------------------------------------------------------------------------------------
2023-10-27 17:45:01,082 EPOCH 3 done: loss 0.0526 - lr: 0.000004
2023-10-27 17:45:25,642 DEV : loss 0.04900110512971878 - f1-score (micro avg) 0.9632
2023-10-27 17:45:25,698 saving best model
2023-10-27 17:45:28,617 ----------------------------------------------------------------------------------------------------
2023-10-27 17:46:15,914 epoch 4 - iter 372/3726 - loss 0.03649343 - time (sec): 47.29 - samples/sec: 421.21 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:47:02,348 epoch 4 - iter 744/3726 - loss 0.03904655 - time (sec): 93.73 - samples/sec: 428.89 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:47:50,016 epoch 4 - iter 1116/3726 - loss 0.03747173 - time (sec): 141.40 - samples/sec: 431.76 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:48:37,609 epoch 4 - iter 1488/3726 - loss 0.03962095 - time (sec): 188.99 - samples/sec: 432.03 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:49:24,478 epoch 4 - iter 1860/3726 - loss 0.03665861 - time (sec): 235.86 - samples/sec: 435.26 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:50:10,615 epoch 4 - iter 2232/3726 - loss 0.03744683 - time (sec): 282.00 - samples/sec: 436.00 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:50:56,634 epoch 4 - iter 2604/3726 - loss 0.03718038 - time (sec): 328.01 - samples/sec: 438.31 - lr: 0.000004 - momentum: 0.000000
2023-10-27 17:51:41,898 epoch 4 - iter 2976/3726 - loss 0.03558423 - time (sec): 373.28 - samples/sec: 440.32 - lr: 0.000003 - momentum: 0.000000
2023-10-27 17:52:27,923 epoch 4 - iter 3348/3726 - loss 0.03562024 - time (sec): 419.30 - samples/sec: 439.49 - lr: 0.000003 - momentum: 0.000000
2023-10-27 17:53:14,720 epoch 4 - iter 3720/3726 - loss 0.03527266 - time (sec): 466.10 - samples/sec: 438.45 - lr: 0.000003 - momentum: 0.000000
2023-10-27 17:53:15,487 ----------------------------------------------------------------------------------------------------
2023-10-27 17:53:15,487 EPOCH 4 done: loss 0.0353 - lr: 0.000003
2023-10-27 17:53:38,247 DEV : loss 0.05077873915433884 - f1-score (micro avg) 0.9689
2023-10-27 17:53:38,300 saving best model
2023-10-27 17:53:41,578 ----------------------------------------------------------------------------------------------------
2023-10-27 17:54:28,406 epoch 5 - iter 372/3726 - loss 0.01635597 - time (sec): 46.83 - samples/sec: 431.15 - lr: 0.000003 - momentum: 0.000000
2023-10-27 17:55:14,040 epoch 5 - iter 744/3726 - loss 0.01995793 - time (sec): 92.46 - samples/sec: 438.55 - lr: 0.000003 - momentum: 0.000000
2023-10-27 17:56:00,008 epoch 5 - iter 1116/3726 - loss 0.02271135 - time (sec): 138.43 - samples/sec: 439.48 - lr: 0.000003 - momentum: 0.000000
2023-10-27 17:56:46,029 epoch 5 - iter 1488/3726 - loss 0.02370028 - time (sec): 184.45 - samples/sec: 439.76 - lr: 0.000003 - momentum: 0.000000
2023-10-27 17:57:32,057 epoch 5 - iter 1860/3726 - loss 0.02496095 - time (sec): 230.48 - samples/sec: 437.98 - lr: 0.000003 - momentum: 0.000000
2023-10-27 17:58:18,548 epoch 5 - iter 2232/3726 - loss 0.02420606 - time (sec): 276.97 - samples/sec: 436.18 - lr: 0.000003 - momentum: 0.000000
2023-10-27 17:59:05,818 epoch 5 - iter 2604/3726 - loss 0.02385058 - time (sec): 324.24 - samples/sec: 438.78 - lr: 0.000003 - momentum: 0.000000
2023-10-27 17:59:52,270 epoch 5 - iter 2976/3726 - loss 0.02471771 - time (sec): 370.69 - samples/sec: 439.50 - lr: 0.000003 - momentum: 0.000000
2023-10-27 18:00:39,106 epoch 5 - iter 3348/3726 - loss 0.02672304 - time (sec): 417.53 - samples/sec: 440.28 - lr: 0.000003 - momentum: 0.000000
2023-10-27 18:01:25,971 epoch 5 - iter 3720/3726 - loss 0.02642411 - time (sec): 464.39 - samples/sec: 439.82 - lr: 0.000003 - momentum: 0.000000
2023-10-27 18:01:26,740 ----------------------------------------------------------------------------------------------------
2023-10-27 18:01:26,740 EPOCH 5 done: loss 0.0264 - lr: 0.000003
2023-10-27 18:01:50,293 DEV : loss 0.05235698074102402 - f1-score (micro avg) 0.972
2023-10-27 18:01:50,346 saving best model
2023-10-27 18:01:53,254 ----------------------------------------------------------------------------------------------------
2023-10-27 18:02:39,752 epoch 6 - iter 372/3726 - loss 0.02916463 - time (sec): 46.49 - samples/sec: 446.96 - lr: 0.000003 - momentum: 0.000000
2023-10-27 18:03:25,626 epoch 6 - iter 744/3726 - loss 0.02452630 - time (sec): 92.37 - samples/sec: 442.98 - lr: 0.000003 - momentum: 0.000000
2023-10-27 18:04:11,837 epoch 6 - iter 1116/3726 - loss 0.02460461 - time (sec): 138.58 - samples/sec: 443.92 - lr: 0.000003 - momentum: 0.000000
2023-10-27 18:04:59,229 epoch 6 - iter 1488/3726 - loss 0.02344474 - time (sec): 185.97 - samples/sec: 441.02 - lr: 0.000003 - momentum: 0.000000
2023-10-27 18:05:46,701 epoch 6 - iter 1860/3726 - loss 0.02371111 - time (sec): 233.44 - samples/sec: 438.96 - lr: 0.000003 - momentum: 0.000000
2023-10-27 18:06:33,625 epoch 6 - iter 2232/3726 - loss 0.02288733 - time (sec): 280.37 - samples/sec: 438.13 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:07:19,620 epoch 6 - iter 2604/3726 - loss 0.02107152 - time (sec): 326.36 - samples/sec: 438.12 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:08:06,623 epoch 6 - iter 2976/3726 - loss 0.02064455 - time (sec): 373.36 - samples/sec: 437.04 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:08:52,249 epoch 6 - iter 3348/3726 - loss 0.02103691 - time (sec): 418.99 - samples/sec: 438.95 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:09:38,374 epoch 6 - iter 3720/3726 - loss 0.02102916 - time (sec): 465.12 - samples/sec: 439.25 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:09:39,127 ----------------------------------------------------------------------------------------------------
2023-10-27 18:09:39,128 EPOCH 6 done: loss 0.0211 - lr: 0.000002
2023-10-27 18:10:02,947 DEV : loss 0.05808666720986366 - f1-score (micro avg) 0.9682
2023-10-27 18:10:03,000 ----------------------------------------------------------------------------------------------------
2023-10-27 18:10:49,637 epoch 7 - iter 372/3726 - loss 0.01575730 - time (sec): 46.63 - samples/sec: 435.88 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:11:35,967 epoch 7 - iter 744/3726 - loss 0.01398724 - time (sec): 92.96 - samples/sec: 438.00 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:12:22,831 epoch 7 - iter 1116/3726 - loss 0.01313005 - time (sec): 139.83 - samples/sec: 442.95 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:13:08,773 epoch 7 - iter 1488/3726 - loss 0.01291165 - time (sec): 185.77 - samples/sec: 443.62 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:13:54,815 epoch 7 - iter 1860/3726 - loss 0.01295979 - time (sec): 231.81 - samples/sec: 441.98 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:14:40,721 epoch 7 - iter 2232/3726 - loss 0.01255139 - time (sec): 277.72 - samples/sec: 442.13 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:15:28,159 epoch 7 - iter 2604/3726 - loss 0.01200459 - time (sec): 325.16 - samples/sec: 439.39 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:16:14,038 epoch 7 - iter 2976/3726 - loss 0.01248980 - time (sec): 371.04 - samples/sec: 440.26 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:16:59,656 epoch 7 - iter 3348/3726 - loss 0.01321463 - time (sec): 416.65 - samples/sec: 441.16 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:17:45,100 epoch 7 - iter 3720/3726 - loss 0.01382182 - time (sec): 462.10 - samples/sec: 442.14 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:17:45,799 ----------------------------------------------------------------------------------------------------
2023-10-27 18:17:45,800 EPOCH 7 done: loss 0.0138 - lr: 0.000002
2023-10-27 18:18:08,992 DEV : loss 0.058880679309368134 - f1-score (micro avg) 0.9703
2023-10-27 18:18:09,048 ----------------------------------------------------------------------------------------------------
2023-10-27 18:18:55,063 epoch 8 - iter 372/3726 - loss 0.01490756 - time (sec): 46.01 - samples/sec: 448.34 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:19:41,086 epoch 8 - iter 744/3726 - loss 0.01079045 - time (sec): 92.03 - samples/sec: 439.19 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:20:27,084 epoch 8 - iter 1116/3726 - loss 0.01191125 - time (sec): 138.03 - samples/sec: 443.50 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:21:12,462 epoch 8 - iter 1488/3726 - loss 0.01100526 - time (sec): 183.41 - samples/sec: 451.17 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:21:58,283 epoch 8 - iter 1860/3726 - loss 0.01185326 - time (sec): 229.23 - samples/sec: 448.88 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:22:44,335 epoch 8 - iter 2232/3726 - loss 0.01176460 - time (sec): 275.28 - samples/sec: 445.95 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:23:30,785 epoch 8 - iter 2604/3726 - loss 0.01194894 - time (sec): 321.73 - samples/sec: 442.88 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:24:17,599 epoch 8 - iter 2976/3726 - loss 0.01190633 - time (sec): 368.55 - samples/sec: 441.32 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:25:04,773 epoch 8 - iter 3348/3726 - loss 0.01189745 - time (sec): 415.72 - samples/sec: 441.59 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:25:51,526 epoch 8 - iter 3720/3726 - loss 0.01191728 - time (sec): 462.48 - samples/sec: 441.51 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:25:52,333 ----------------------------------------------------------------------------------------------------
2023-10-27 18:25:52,333 EPOCH 8 done: loss 0.0120 - lr: 0.000001
2023-10-27 18:26:16,427 DEV : loss 0.05278489366173744 - f1-score (micro avg) 0.9741
2023-10-27 18:26:16,480 saving best model
2023-10-27 18:26:19,470 ----------------------------------------------------------------------------------------------------
2023-10-27 18:27:05,544 epoch 9 - iter 372/3726 - loss 0.00965068 - time (sec): 46.07 - samples/sec: 441.62 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:27:51,054 epoch 9 - iter 744/3726 - loss 0.00782923 - time (sec): 91.58 - samples/sec: 445.46 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:28:36,410 epoch 9 - iter 1116/3726 - loss 0.00714565 - time (sec): 136.94 - samples/sec: 450.28 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:29:22,141 epoch 9 - iter 1488/3726 - loss 0.00788002 - time (sec): 182.67 - samples/sec: 451.72 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:30:08,759 epoch 9 - iter 1860/3726 - loss 0.00817722 - time (sec): 229.29 - samples/sec: 448.24 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:30:54,418 epoch 9 - iter 2232/3726 - loss 0.00812347 - time (sec): 274.95 - samples/sec: 446.63 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:31:40,019 epoch 9 - iter 2604/3726 - loss 0.00818017 - time (sec): 320.55 - samples/sec: 446.97 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:32:25,812 epoch 9 - iter 2976/3726 - loss 0.00800987 - time (sec): 366.34 - samples/sec: 445.32 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:33:11,863 epoch 9 - iter 3348/3726 - loss 0.00815904 - time (sec): 412.39 - samples/sec: 445.22 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:33:57,717 epoch 9 - iter 3720/3726 - loss 0.00780421 - time (sec): 458.24 - samples/sec: 445.83 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:33:58,426 ----------------------------------------------------------------------------------------------------
2023-10-27 18:33:58,426 EPOCH 9 done: loss 0.0078 - lr: 0.000001
2023-10-27 18:34:21,819 DEV : loss 0.05219843238592148 - f1-score (micro avg) 0.9766
2023-10-27 18:34:21,872 saving best model
2023-10-27 18:34:24,714 ----------------------------------------------------------------------------------------------------
2023-10-27 18:35:10,773 epoch 10 - iter 372/3726 - loss 0.00735364 - time (sec): 46.06 - samples/sec: 439.89 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:35:55,957 epoch 10 - iter 744/3726 - loss 0.00724933 - time (sec): 91.24 - samples/sec: 448.71 - lr: 0.000000 - momentum: 0.000000
2023-10-27 18:36:41,613 epoch 10 - iter 1116/3726 - loss 0.00540230 - time (sec): 136.90 - samples/sec: 451.22 - lr: 0.000000 - momentum: 0.000000
2023-10-27 18:37:26,965 epoch 10 - iter 1488/3726 - loss 0.00604023 - time (sec): 182.25 - samples/sec: 453.12 - lr: 0.000000 - momentum: 0.000000
2023-10-27 18:38:12,558 epoch 10 - iter 1860/3726 - loss 0.00646998 - time (sec): 227.84 - samples/sec: 452.95 - lr: 0.000000 - momentum: 0.000000
2023-10-27 18:38:58,762 epoch 10 - iter 2232/3726 - loss 0.00621628 - time (sec): 274.05 - samples/sec: 449.84 - lr: 0.000000 - momentum: 0.000000
2023-10-27 18:39:45,050 epoch 10 - iter 2604/3726 - loss 0.00611127 - time (sec): 320.33 - samples/sec: 447.82 - lr: 0.000000 - momentum: 0.000000
2023-10-27 18:40:30,686 epoch 10 - iter 2976/3726 - loss 0.00609183 - time (sec): 365.97 - samples/sec: 447.92 - lr: 0.000000 - momentum: 0.000000
2023-10-27 18:41:16,049 epoch 10 - iter 3348/3726 - loss 0.00601238 - time (sec): 411.33 - samples/sec: 447.77 - lr: 0.000000 - momentum: 0.000000
2023-10-27 18:42:01,610 epoch 10 - iter 3720/3726 - loss 0.00601111 - time (sec): 456.89 - samples/sec: 447.20 - lr: 0.000000 - momentum: 0.000000
2023-10-27 18:42:02,342 ----------------------------------------------------------------------------------------------------
2023-10-27 18:42:02,343 EPOCH 10 done: loss 0.0060 - lr: 0.000000
2023-10-27 18:42:25,659 DEV : loss 0.05048835650086403 - f1-score (micro avg) 0.9762
2023-10-27 18:42:28,164 ----------------------------------------------------------------------------------------------------
2023-10-27 18:42:28,165 Loading model from best epoch ...
2023-10-27 18:42:36,107 SequenceTagger predicts: Dictionary with 17 tags: O, S-ORG, B-ORG, E-ORG, I-ORG, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-MISC, B-MISC, E-MISC, I-MISC
2023-10-27 18:42:59,035 
Results:
- F-score (micro) 0.9702
- F-score (macro) 0.9649
- Accuracy 0.956

By class:
              precision    recall  f1-score   support

         ORG     0.9623    0.9749    0.9685      1909
         PER     0.9962    0.9950    0.9956      1591
         LOC     0.9729    0.9660    0.9695      1413
        MISC     0.9178    0.9347    0.9262       812

   micro avg     0.9678    0.9726    0.9702      5725
   macro avg     0.9623    0.9676    0.9649      5725
weighted avg     0.9680    0.9726    0.9703      5725

2023-10-27 18:42:59,036 ----------------------------------------------------------------------------------------------------