2023-10-27 18:44:31,683 ----------------------------------------------------------------------------------------------------
2023-10-27 18:44:31,685 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): XLMRobertaModel(
      (embeddings): XLMRobertaEmbeddings(
        (word_embeddings): Embedding(250003, 1024)
        (position_embeddings): Embedding(514, 1024, padding_idx=1)
        (token_type_embeddings): Embedding(1, 1024)
        (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): XLMRobertaEncoder(
        (layer): ModuleList(
          (0-23): 24 x XLMRobertaLayer(
            (attention): XLMRobertaAttention(
              (self): XLMRobertaSelfAttention(
                (query): Linear(in_features=1024, out_features=1024, bias=True)
                (key): Linear(in_features=1024, out_features=1024, bias=True)
                (value): Linear(in_features=1024, out_features=1024, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): XLMRobertaSelfOutput(
                (dense): Linear(in_features=1024, out_features=1024, bias=True)
                (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): XLMRobertaIntermediate(
              (dense): Linear(in_features=1024, out_features=4096, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): XLMRobertaOutput(
              (dense): Linear(in_features=4096, out_features=1024, bias=True)
              (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): XLMRobertaPooler(
        (dense): Linear(in_features=1024, out_features=1024, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1024, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-27 18:44:31,685 ----------------------------------------------------------------------------------------------------
2023-10-27 18:44:31,685 Corpus: 14903 train + 3449 dev + 3658 test sentences
2023-10-27 18:44:31,685 ----------------------------------------------------------------------------------------------------
2023-10-27 18:44:31,685 Train: 14903 sentences
2023-10-27 18:44:31,685 (train_with_dev=False, train_with_test=False)
2023-10-27 18:44:31,685 ----------------------------------------------------------------------------------------------------
2023-10-27 18:44:31,685 Training Params:
2023-10-27 18:44:31,685 - learning_rate: "5e-06"
2023-10-27 18:44:31,685 - mini_batch_size: "4"
2023-10-27 18:44:31,685 - max_epochs: "10"
2023-10-27 18:44:31,685 - shuffle: "True"
2023-10-27 18:44:31,685 ----------------------------------------------------------------------------------------------------
2023-10-27 18:44:31,685 Plugins:
2023-10-27 18:44:31,685 - TensorboardLogger
2023-10-27 18:44:31,686 - LinearScheduler | warmup_fraction: '0.1'
2023-10-27 18:44:31,686 ----------------------------------------------------------------------------------------------------
2023-10-27 18:44:31,686 Final evaluation on model from best epoch (best-model.pt)
2023-10-27 18:44:31,686 - metric: "('micro avg', 'f1-score')"
2023-10-27 18:44:31,686 ----------------------------------------------------------------------------------------------------
2023-10-27 18:44:31,686 Computation:
2023-10-27 18:44:31,686 - compute on device: cuda:0
2023-10-27 18:44:31,686 - embedding storage: none
2023-10-27 18:44:31,686 ----------------------------------------------------------------------------------------------------
2023-10-27 18:44:31,686 Model training base path: "flair-clean-conll-lr5e-06-bs4-4"
2023-10-27 18:44:31,686 ----------------------------------------------------------------------------------------------------
2023-10-27 18:44:31,686 ----------------------------------------------------------------------------------------------------
2023-10-27 18:44:31,686 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-27 18:45:17,500 epoch 1 - iter 372/3726 - loss 2.56008396 - time (sec): 45.81 - samples/sec: 437.21 - lr: 0.000000 - momentum: 0.000000
2023-10-27 18:46:02,660 epoch 1 - iter 744/3726 - loss 1.77263965 - time (sec): 90.97 - samples/sec: 440.01 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:46:48,640 epoch 1 - iter 1116/3726 - loss 1.37087487 - time (sec): 136.95 - samples/sec: 443.33 - lr: 0.000001 - momentum: 0.000000
2023-10-27 18:47:34,220 epoch 1 - iter 1488/3726 - loss 1.13762799 - time (sec): 182.53 - samples/sec: 445.61 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:48:19,175 epoch 1 - iter 1860/3726 - loss 0.97055504 - time (sec): 227.49 - samples/sec: 446.76 - lr: 0.000002 - momentum: 0.000000
2023-10-27 18:49:04,645 epoch 1 - iter 2232/3726 - loss 0.83989727 - time (sec): 272.96 - samples/sec: 449.00 - lr: 0.000003 - momentum: 0.000000
2023-10-27 18:49:49,378 epoch 1 - iter 2604/3726 - loss 0.74197502 - time (sec): 317.69 - samples/sec: 450.38 - lr: 0.000003 - momentum: 0.000000
2023-10-27 18:50:34,930 epoch 1 - iter 2976/3726 - loss 0.66837341 - time (sec): 363.24 - samples/sec: 449.48 - lr: 0.000004 - momentum: 0.000000
2023-10-27 18:51:20,133 epoch 1 - iter 3348/3726 - loss 0.60924784 - time (sec): 408.45 - samples/sec: 448.31 - lr: 0.000004 - momentum: 0.000000
2023-10-27 18:52:08,456 epoch 1 - iter 3720/3726 - loss 0.55515076 - time (sec): 456.77 - samples/sec: 447.31 - lr: 0.000005 - momentum: 0.000000
2023-10-27 18:52:09,145 ----------------------------------------------------------------------------------------------------
2023-10-27 18:52:09,145 EPOCH 1 done: loss 0.5546 - lr: 0.000005
2023-10-27 18:52:30,774 DEV : loss 0.10047101974487305 - f1-score (micro avg) 0.931
2023-10-27 18:52:30,829 saving best model
2023-10-27 18:52:33,017 ----------------------------------------------------------------------------------------------------
2023-10-27 18:53:18,587 epoch 2 - iter 372/3726 - loss 0.11546777 - time (sec): 45.57 - samples/sec: 460.22 - lr: 0.000005 - momentum: 0.000000
2023-10-27 18:54:05,289 epoch 2 - iter 744/3726 - loss 0.10660754 - time (sec): 92.27 - samples/sec: 449.30 - lr: 0.000005 - momentum: 0.000000
2023-10-27 18:54:51,000 epoch 2 - iter 1116/3726 - loss 0.10038157 - time (sec): 137.98 - samples/sec: 455.34 - lr: 0.000005 - momentum: 0.000000
2023-10-27 18:55:36,869 epoch 2 - iter 1488/3726 - loss 0.09684308 - time (sec): 183.85 - samples/sec: 451.24 - lr: 0.000005 - momentum: 0.000000
2023-10-27 18:56:23,094 epoch 2 - iter 1860/3726 - loss 0.09806463 - time (sec): 230.07 - samples/sec: 450.98 - lr: 0.000005 - momentum: 0.000000
2023-10-27 18:57:08,793 epoch 2 - iter 2232/3726 - loss 0.09423182 - time (sec): 275.77 - samples/sec: 450.87 - lr: 0.000005 - momentum: 0.000000
2023-10-27 18:57:53,912 epoch 2 - iter 2604/3726 - loss 0.09270500 - time (sec): 320.89 - samples/sec: 452.25 - lr: 0.000005 - momentum: 0.000000
2023-10-27 18:58:39,792 epoch 2 - iter 2976/3726 - loss 0.08995641 - time (sec): 366.77 - samples/sec: 451.59 - lr: 0.000005 - momentum: 0.000000
2023-10-27 18:59:26,428 epoch 2 - iter 3348/3726 - loss 0.08720003 - time (sec): 413.41 - samples/sec: 447.43 - lr: 0.000005 - momentum: 0.000000
2023-10-27 19:00:11,611 epoch 2 - iter 3720/3726 - loss 0.08576854 - time (sec): 458.59 - samples/sec: 445.51 - lr: 0.000004 - momentum: 0.000000
2023-10-27 19:00:12,346 ----------------------------------------------------------------------------------------------------
2023-10-27 19:00:12,347 EPOCH 2 done: loss 0.0857 - lr: 0.000004
2023-10-27 19:00:35,384 DEV : loss 0.06790720671415329 - f1-score (micro avg) 0.9571
2023-10-27 19:00:35,438 saving best model
2023-10-27 19:00:37,853 ----------------------------------------------------------------------------------------------------
2023-10-27 19:01:23,535 epoch 3 - iter 372/3726 - loss 0.07374127 - time (sec): 45.68 - samples/sec: 439.85 - lr: 0.000004 - momentum: 0.000000
2023-10-27 19:02:08,846 epoch 3 - iter 744/3726 - loss 0.06087161 - time (sec): 90.99 - samples/sec: 440.33 - lr: 0.000004 - momentum: 0.000000
2023-10-27 19:02:54,176 epoch 3 - iter 1116/3726 - loss 0.05718760 - time (sec): 136.32 - samples/sec: 444.62 - lr: 0.000004 - momentum: 0.000000
2023-10-27 19:03:40,248 epoch 3 - iter 1488/3726 - loss 0.05400073 - time (sec): 182.39 - samples/sec: 447.02 - lr: 0.000004 - momentum: 0.000000
2023-10-27 19:04:26,095 epoch 3 - iter 1860/3726 - loss 0.05589718 - time (sec): 228.24 - samples/sec: 447.08 - lr: 0.000004 - momentum: 0.000000
2023-10-27 19:05:12,308 epoch 3 - iter 2232/3726 - loss 0.05415602 - time (sec): 274.45 - samples/sec: 445.29 - lr: 0.000004 - momentum: 0.000000
2023-10-27 19:05:58,644 epoch 3 - iter 2604/3726 - loss 0.05279672 - time (sec): 320.79 - samples/sec: 445.89 - lr: 0.000004 - momentum: 0.000000
2023-10-27 19:06:44,498 epoch 3 - iter 2976/3726 - loss 0.05148716 - time (sec): 366.64 - samples/sec: 446.33 - lr: 0.000004 - momentum: 0.000000
2023-10-27 19:07:29,887 epoch 3 - iter 3348/3726 - loss 0.05136170 - time (sec): 412.03 - samples/sec: 446.61 - lr: 0.000004 - momentum: 0.000000
2023-10-27 19:08:16,632 epoch 3 - iter 3720/3726 - loss 0.05145199 - time (sec): 458.78 - samples/sec: 445.53 - lr: 0.000004 - momentum: 0.000000
2023-10-27 19:08:17,367 ----------------------------------------------------------------------------------------------------
2023-10-27 19:08:17,367 EPOCH 3 done: loss 0.0514 - lr: 0.000004
2023-10-27 19:08:39,846 DEV : loss 0.052444763481616974 - f1-score (micro avg) 0.9626
2023-10-27 19:08:39,896 saving best model
2023-10-27 19:08:42,267 ----------------------------------------------------------------------------------------------------
2023-10-27 19:09:28,802 epoch 4 - iter 372/3726 - loss 0.03871254 - time (sec): 46.53 - samples/sec: 454.07 - lr: 0.000004 - momentum: 0.000000
2023-10-27 19:10:14,349 epoch 4 - iter 744/3726 - loss 0.04116213 - time (sec): 92.08 - samples/sec: 449.79 - lr: 0.000004 - momentum: 0.000000
2023-10-27 19:11:00,008 epoch 4 - iter 1116/3726 - loss 0.04020488 - time (sec): 137.74 - samples/sec: 454.42 - lr: 0.000004 - momentum: 0.000000
2023-10-27 19:11:45,448 epoch 4 - iter 1488/3726 - loss 0.04174189 - time (sec): 183.18 - samples/sec: 445.99 - lr: 0.000004 - momentum: 0.000000
2023-10-27 19:12:31,270 epoch 4 - iter 1860/3726 - loss 0.04211643 - time (sec): 229.00 - samples/sec: 444.26 - lr: 0.000004 - momentum: 0.000000
2023-10-27 19:13:16,951 epoch 4 - iter 2232/3726 - loss 0.04082069 - time (sec): 274.68 - samples/sec: 443.61 - lr: 0.000004 - momentum: 0.000000
2023-10-27 19:14:02,493 epoch 4 - iter 2604/3726 - loss 0.03901976 - time (sec): 320.22 - samples/sec: 446.31 - lr: 0.000004 - momentum: 0.000000
2023-10-27 19:14:48,002 epoch 4 - iter 2976/3726 - loss 0.03797498 - time (sec): 365.73 - samples/sec: 446.53 - lr: 0.000003 - momentum: 0.000000
2023-10-27 19:15:33,644 epoch 4 - iter 3348/3726 - loss 0.03778630 - time (sec): 411.37 - samples/sec: 445.67 - lr: 0.000003 - momentum: 0.000000
2023-10-27 19:16:19,632 epoch 4 - iter 3720/3726 - loss 0.03709369 - time (sec): 457.36 - samples/sec: 446.74 - lr: 0.000003 - momentum: 0.000000
2023-10-27 19:16:20,375 ----------------------------------------------------------------------------------------------------
2023-10-27 19:16:20,375 EPOCH 4 done: loss 0.0372 - lr: 0.000003
2023-10-27 19:16:43,692 DEV : loss 0.05450737103819847 - f1-score (micro avg) 0.9647
2023-10-27 19:16:43,745 saving best model
2023-10-27 19:16:46,767 ----------------------------------------------------------------------------------------------------
2023-10-27 19:17:32,579 epoch 5 - iter 372/3726 - loss 0.02845766 - time (sec): 45.81 - samples/sec: 459.66 - lr: 0.000003 - momentum: 0.000000
2023-10-27 19:18:18,467 epoch 5 - iter 744/3726 - loss 0.02421414 - time (sec): 91.70 - samples/sec: 448.91 - lr: 0.000003 - momentum: 0.000000
2023-10-27 19:19:04,523 epoch 5 - iter 1116/3726 - loss 0.02470444 - time (sec): 137.75 - samples/sec: 447.01 - lr: 0.000003 - momentum: 0.000000
2023-10-27 19:19:50,020 epoch 5 - iter 1488/3726 - loss 0.02877377 - time (sec): 183.25 - samples/sec: 446.70 - lr: 0.000003 - momentum: 0.000000
2023-10-27 19:20:35,406 epoch 5 - iter 1860/3726 - loss 0.02956343 - time (sec): 228.64 - samples/sec: 445.95 - lr: 0.000003 - momentum: 0.000000
2023-10-27 19:21:21,375 epoch 5 - iter 2232/3726 - loss 0.02911303 - time (sec): 274.61 - samples/sec: 444.00 - lr: 0.000003 - momentum: 0.000000
2023-10-27 19:22:06,761 epoch 5 - iter 2604/3726 - loss 0.02983439 - time (sec): 319.99 - samples/sec: 446.51 - lr: 0.000003 - momentum: 0.000000
2023-10-27 19:22:52,640 epoch 5 - iter 2976/3726 - loss 0.02953542 - time (sec): 365.87 - samples/sec: 447.29 - lr: 0.000003 - momentum: 0.000000
2023-10-27 19:23:37,972 epoch 5 - iter 3348/3726 - loss 0.02932321 - time (sec): 411.20 - samples/sec: 448.46 - lr: 0.000003 - momentum: 0.000000
2023-10-27 19:24:24,955 epoch 5 - iter 3720/3726 - loss 0.02916844 - time (sec): 458.19 - samples/sec: 445.99 - lr: 0.000003 - momentum: 0.000000
2023-10-27 19:24:25,687 ----------------------------------------------------------------------------------------------------
2023-10-27 19:24:25,688 EPOCH 5 done: loss 0.0292 - lr: 0.000003
2023-10-27 19:24:48,031 DEV : loss 0.05429258942604065 - f1-score (micro avg) 0.9647
2023-10-27 19:24:48,084 saving best model
2023-10-27 19:24:50,724 ----------------------------------------------------------------------------------------------------
2023-10-27 19:25:36,994 epoch 6 - iter 372/3726 - loss 0.01703796 - time (sec): 46.27 - samples/sec: 456.14 - lr: 0.000003 - momentum: 0.000000
2023-10-27 19:26:22,882 epoch 6 - iter 744/3726 - loss 0.02169011 - time (sec): 92.16 - samples/sec: 454.83 - lr: 0.000003 - momentum: 0.000000
2023-10-27 19:27:07,925 epoch 6 - iter 1116/3726 - loss 0.02127737 - time (sec): 137.20 - samples/sec: 451.49 - lr: 0.000003 - momentum: 0.000000
2023-10-27 19:27:53,410 epoch 6 - iter 1488/3726 - loss 0.02157007 - time (sec): 182.68 - samples/sec: 452.54 - lr: 0.000003 - momentum: 0.000000
2023-10-27 19:28:39,521 epoch 6 - iter 1860/3726 - loss 0.02169603 - time (sec): 228.79 - samples/sec: 449.67 - lr: 0.000003 - momentum: 0.000000
2023-10-27 19:29:24,934 epoch 6 - iter 2232/3726 - loss 0.02182002 - time (sec): 274.21 - samples/sec: 451.13 - lr: 0.000002 - momentum: 0.000000
2023-10-27 19:30:11,317 epoch 6 - iter 2604/3726 - loss 0.02225228 - time (sec): 320.59 - samples/sec: 448.74 - lr: 0.000002 - momentum: 0.000000
2023-10-27 19:30:56,485 epoch 6 - iter 2976/3726 - loss 0.02130458 - time (sec): 365.76 - samples/sec: 450.41 - lr: 0.000002 - momentum: 0.000000
2023-10-27 19:31:42,001 epoch 6 - iter 3348/3726 - loss 0.02151336 - time (sec): 411.28 - samples/sec: 448.25 - lr: 0.000002 - momentum: 0.000000
2023-10-27 19:32:28,009 epoch 6 - iter 3720/3726 - loss 0.02092486 - time (sec): 457.28 - samples/sec: 446.72 - lr: 0.000002 - momentum: 0.000000
2023-10-27 19:32:28,753 ----------------------------------------------------------------------------------------------------
2023-10-27 19:32:28,753 EPOCH 6 done: loss 0.0209 - lr: 0.000002
2023-10-27 19:32:52,231 DEV : loss 0.052240729331970215 - f1-score (micro avg) 0.9692
2023-10-27 19:32:52,286 saving best model
2023-10-27 19:32:55,357 ----------------------------------------------------------------------------------------------------
2023-10-27 19:33:41,068 epoch 7 - iter 372/3726 - loss 0.02108834 - time (sec): 45.71 - samples/sec: 450.92 - lr: 0.000002 - momentum: 0.000000
2023-10-27 19:34:26,441 epoch 7 - iter 744/3726 - loss 0.01802067 - time (sec): 91.08 - samples/sec: 447.77 - lr: 0.000002 - momentum: 0.000000
2023-10-27 19:35:12,147 epoch 7 - iter 1116/3726 - loss 0.01632005 - time (sec): 136.79 - samples/sec: 445.34 - lr: 0.000002 - momentum: 0.000000
2023-10-27 19:35:57,225 epoch 7 - iter 1488/3726 - loss 0.01518131 - time (sec): 181.87 - samples/sec: 444.13 - lr: 0.000002 - momentum: 0.000000
2023-10-27 19:36:42,626 epoch 7 - iter 1860/3726 - loss 0.01620599 - time (sec): 227.27 - samples/sec: 445.22 - lr: 0.000002 - momentum: 0.000000
2023-10-27 19:37:28,781 epoch 7 - iter 2232/3726 - loss 0.01630849 - time (sec): 273.42 - samples/sec: 446.32 - lr: 0.000002 - momentum: 0.000000
2023-10-27 19:38:14,783 epoch 7 - iter 2604/3726 - loss 0.01643476 - time (sec): 319.42 - samples/sec: 447.07 - lr: 0.000002 - momentum: 0.000000
2023-10-27 19:39:00,647 epoch 7 - iter 2976/3726 - loss 0.01586704 - time (sec): 365.29 - samples/sec: 447.99 - lr: 0.000002 - momentum: 0.000000
2023-10-27 19:39:46,421 epoch 7 - iter 3348/3726 - loss 0.01547939 - time (sec): 411.06 - samples/sec: 448.19 - lr: 0.000002 - momentum: 0.000000
2023-10-27 19:40:32,073 epoch 7 - iter 3720/3726 - loss 0.01513403 - time (sec): 456.71 - samples/sec: 447.41 - lr: 0.000002 - momentum: 0.000000
2023-10-27 19:40:32,785 ----------------------------------------------------------------------------------------------------
2023-10-27 19:40:32,785 EPOCH 7 done: loss 0.0151 - lr: 0.000002
2023-10-27 19:40:55,727 DEV : loss 0.050821226090192795 - f1-score (micro avg) 0.9714
2023-10-27 19:40:55,781 saving best model
2023-10-27 19:40:58,713 ----------------------------------------------------------------------------------------------------
2023-10-27 19:41:44,020 epoch 8 - iter 372/3726 - loss 0.01532970 - time (sec): 45.30 - samples/sec: 450.04 - lr: 0.000002 - momentum: 0.000000
2023-10-27 19:42:29,890 epoch 8 - iter 744/3726 - loss 0.01208467 - time (sec): 91.18 - samples/sec: 452.34 - lr: 0.000002 - momentum: 0.000000
2023-10-27 19:43:16,086 epoch 8 - iter 1116/3726 - loss 0.01337061 - time (sec): 137.37 - samples/sec: 446.08 - lr: 0.000002 - momentum: 0.000000
2023-10-27 19:44:02,334 epoch 8 - iter 1488/3726 - loss 0.01184764 - time (sec): 183.62 - samples/sec: 444.41 - lr: 0.000001 - momentum: 0.000000
2023-10-27 19:44:48,481 epoch 8 - iter 1860/3726 - loss 0.01162265 - time (sec): 229.77 - samples/sec: 444.22 - lr: 0.000001 - momentum: 0.000000
2023-10-27 19:45:33,897 epoch 8 - iter 2232/3726 - loss 0.01116723 - time (sec): 275.18 - samples/sec: 447.92 - lr: 0.000001 - momentum: 0.000000
2023-10-27 19:46:19,563 epoch 8 - iter 2604/3726 - loss 0.01237156 - time (sec): 320.85 - samples/sec: 448.45 - lr: 0.000001 - momentum: 0.000000
2023-10-27 19:47:06,257 epoch 8 - iter 2976/3726 - loss 0.01216515 - time (sec): 367.54 - samples/sec: 445.88 - lr: 0.000001 - momentum: 0.000000
2023-10-27 19:47:51,860 epoch 8 - iter 3348/3726 - loss 0.01228458 - time (sec): 413.15 - samples/sec: 445.41 - lr: 0.000001 - momentum: 0.000000
2023-10-27 19:48:37,302 epoch 8 - iter 3720/3726 - loss 0.01173645 - time (sec): 458.59 - samples/sec: 445.55 - lr: 0.000001 - momentum: 0.000000
2023-10-27 19:48:38,037 ----------------------------------------------------------------------------------------------------
2023-10-27 19:48:38,038 EPOCH 8 done: loss 0.0117 - lr: 0.000001
2023-10-27 19:49:01,125 DEV : loss 0.05222569778561592 - f1-score (micro avg) 0.9726
2023-10-27 19:49:01,178 saving best model
2023-10-27 19:49:03,903 ----------------------------------------------------------------------------------------------------
2023-10-27 19:49:49,349 epoch 9 - iter 372/3726 - loss 0.00486713 - time (sec): 45.44 - samples/sec: 455.66 - lr: 0.000001 - momentum: 0.000000
2023-10-27 19:50:34,984 epoch 9 - iter 744/3726 - loss 0.00545741 - time (sec): 91.08 - samples/sec: 453.89 - lr: 0.000001 - momentum: 0.000000
2023-10-27 19:51:20,189 epoch 9 - iter 1116/3726 - loss 0.00787820 - time (sec): 136.28 - samples/sec: 452.07 - lr: 0.000001 - momentum: 0.000000
2023-10-27 19:52:05,623 epoch 9 - iter 1488/3726 - loss 0.00691940 - time (sec): 181.72 - samples/sec: 451.59 - lr: 0.000001 - momentum: 0.000000
2023-10-27 19:52:51,095 epoch 9 - iter 1860/3726 - loss 0.00761910 - time (sec): 227.19 - samples/sec: 452.40 - lr: 0.000001 - momentum: 0.000000
2023-10-27 19:53:36,717 epoch 9 - iter 2232/3726 - loss 0.00762123 - time (sec): 272.81 - samples/sec: 450.93 - lr: 0.000001 - momentum: 0.000000
2023-10-27 19:54:23,451 epoch 9 - iter 2604/3726 - loss 0.00784988 - time (sec): 319.55 - samples/sec: 448.92 - lr: 0.000001 - momentum: 0.000000
2023-10-27 19:55:08,672 epoch 9 - iter 2976/3726 - loss 0.00844099 - time (sec): 364.77 - samples/sec: 449.81 - lr: 0.000001 - momentum: 0.000000
2023-10-27 19:55:53,980 epoch 9 - iter 3348/3726 - loss 0.00836439 - time (sec): 410.07 - samples/sec: 449.23 - lr: 0.000001 - momentum: 0.000000
2023-10-27 19:56:39,471 epoch 9 - iter 3720/3726 - loss 0.00807807 - time (sec): 455.57 - samples/sec: 448.18 - lr: 0.000001 - momentum: 0.000000
2023-10-27 19:56:40,233 ----------------------------------------------------------------------------------------------------
2023-10-27 19:56:40,233 EPOCH 9 done: loss 0.0081 - lr: 0.000001
2023-10-27 19:57:03,898 DEV : loss 0.05321481078863144 - f1-score (micro avg) 0.9742
2023-10-27 19:57:03,949 saving best model
2023-10-27 19:57:06,562 ----------------------------------------------------------------------------------------------------
2023-10-27 19:57:52,150 epoch 10 - iter 372/3726 - loss 0.00368318 - time (sec): 45.59 - samples/sec: 447.26 - lr: 0.000001 - momentum: 0.000000
2023-10-27 19:58:37,058 epoch 10 - iter 744/3726 - loss 0.00490381 - time (sec): 90.49 - samples/sec: 441.64 - lr: 0.000000 - momentum: 0.000000
2023-10-27 19:59:22,616 epoch 10 - iter 1116/3726 - loss 0.00591526 - time (sec): 136.05 - samples/sec: 444.19 - lr: 0.000000 - momentum: 0.000000
2023-10-27 20:00:08,343 epoch 10 - iter 1488/3726 - loss 0.00634553 - time (sec): 181.78 - samples/sec: 443.45 - lr: 0.000000 - momentum: 0.000000
2023-10-27 20:00:54,241 epoch 10 - iter 1860/3726 - loss 0.00584248 - time (sec): 227.68 - samples/sec: 441.45 - lr: 0.000000 - momentum: 0.000000
2023-10-27 20:01:40,141 epoch 10 - iter 2232/3726 - loss 0.00572180 - time (sec): 273.58 - samples/sec: 443.18 - lr: 0.000000 - momentum: 0.000000
2023-10-27 20:02:25,686 epoch 10 - iter 2604/3726 - loss 0.00598793 - time (sec): 319.12 - samples/sec: 445.80 - lr: 0.000000 - momentum: 0.000000
2023-10-27 20:03:11,659 epoch 10 - iter 2976/3726 - loss 0.00597232 - time (sec): 365.09 - samples/sec: 445.70 - lr: 0.000000 - momentum: 0.000000
2023-10-27 20:03:57,027 epoch 10 - iter 3348/3726 - loss 0.00599237 - time (sec): 410.46 - samples/sec: 447.67 - lr: 0.000000 - momentum: 0.000000
2023-10-27 20:04:42,462 epoch 10 - iter 3720/3726 - loss 0.00636112 - time (sec): 455.90 - samples/sec: 448.08 - lr: 0.000000 - momentum: 0.000000
2023-10-27 20:04:43,220 ----------------------------------------------------------------------------------------------------
2023-10-27 20:04:43,220 EPOCH 10 done: loss 0.0064 - lr: 0.000000
2023-10-27 20:05:06,157 DEV : loss 0.05384046211838722 - f1-score (micro avg) 0.9737
2023-10-27 20:05:08,560 ----------------------------------------------------------------------------------------------------
2023-10-27 20:05:08,562 Loading model from best epoch ...
2023-10-27 20:05:16,264 SequenceTagger predicts: Dictionary with 17 tags: O, S-ORG, B-ORG, E-ORG, I-ORG, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-MISC, B-MISC, E-MISC, I-MISC
2023-10-27 20:05:38,883 Results:
- F-score (micro) 0.9696
- F-score (macro) 0.9645
- Accuracy 0.9555

By class:
              precision    recall  f1-score   support

         ORG     0.9665    0.9665    0.9665      1909
         PER     0.9956    0.9956    0.9956      1591
         LOC     0.9723    0.9674    0.9698      1413
        MISC     0.9117    0.9409    0.9261       812

   micro avg     0.9680    0.9712    0.9696      5725
   macro avg     0.9615    0.9676    0.9645      5725
weighted avg     0.9682    0.9712    0.9697      5725

2023-10-27 20:05:38,883 ----------------------------------------------------------------------------------------------------