2023-10-17 09:50:37,113 ----------------------------------------------------------------------------------------------------
2023-10-17 09:50:37,114 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 09:50:37,114 ----------------------------------------------------------------------------------------------------
2023-10-17 09:50:37,115 MultiCorpus: 6183 train + 680 dev + 2113 test sentences
 - NER_HIPE_2022 Corpus: 6183 train + 680 dev + 2113 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/topres19th/en/with_doc_seperator
2023-10-17 09:50:37,115
----------------------------------------------------------------------------------------------------
2023-10-17 09:50:37,115 Train: 6183 sentences
2023-10-17 09:50:37,115 (train_with_dev=False, train_with_test=False)
2023-10-17 09:50:37,115 ----------------------------------------------------------------------------------------------------
2023-10-17 09:50:37,115 Training Params:
2023-10-17 09:50:37,115 - learning_rate: "5e-05"
2023-10-17 09:50:37,115 - mini_batch_size: "8"
2023-10-17 09:50:37,115 - max_epochs: "10"
2023-10-17 09:50:37,115 - shuffle: "True"
2023-10-17 09:50:37,115 ----------------------------------------------------------------------------------------------------
2023-10-17 09:50:37,116 Plugins:
2023-10-17 09:50:37,116 - TensorboardLogger
2023-10-17 09:50:37,116 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 09:50:37,116 ----------------------------------------------------------------------------------------------------
2023-10-17 09:50:37,116 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 09:50:37,116 - metric: "('micro avg', 'f1-score')"
2023-10-17 09:50:37,116 ----------------------------------------------------------------------------------------------------
2023-10-17 09:50:37,116 Computation:
2023-10-17 09:50:37,116 - compute on device: cuda:0
2023-10-17 09:50:37,116 - embedding storage: none
2023-10-17 09:50:37,116 ----------------------------------------------------------------------------------------------------
2023-10-17 09:50:37,116 Model training base path: "hmbench-topres19th/en-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 09:50:37,116 ----------------------------------------------------------------------------------------------------
2023-10-17 09:50:37,116 ----------------------------------------------------------------------------------------------------
2023-10-17 09:50:37,117 Logging anything other than scalars to
TensorBoard is currently not supported.
2023-10-17 09:50:44,240 epoch 1 - iter 77/773 - loss 1.99803994 - time (sec): 7.12 - samples/sec: 1805.03 - lr: 0.000005 - momentum: 0.000000
2023-10-17 09:50:51,561 epoch 1 - iter 154/773 - loss 1.14101515 - time (sec): 14.44 - samples/sec: 1736.92 - lr: 0.000010 - momentum: 0.000000
2023-10-17 09:50:58,592 epoch 1 - iter 231/773 - loss 0.81080166 - time (sec): 21.47 - samples/sec: 1742.53 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:51:05,952 epoch 1 - iter 308/773 - loss 0.63449031 - time (sec): 28.83 - samples/sec: 1748.52 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:51:13,243 epoch 1 - iter 385/773 - loss 0.53077408 - time (sec): 36.12 - samples/sec: 1734.04 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:51:20,423 epoch 1 - iter 462/773 - loss 0.46116490 - time (sec): 43.31 - samples/sec: 1730.19 - lr: 0.000030 - momentum: 0.000000
2023-10-17 09:51:28,021 epoch 1 - iter 539/773 - loss 0.41868646 - time (sec): 50.90 - samples/sec: 1703.61 - lr: 0.000035 - momentum: 0.000000
2023-10-17 09:51:35,107 epoch 1 - iter 616/773 - loss 0.38248750 - time (sec): 57.99 - samples/sec: 1703.90 - lr: 0.000040 - momentum: 0.000000
2023-10-17 09:51:42,607 epoch 1 - iter 693/773 - loss 0.34805302 - time (sec): 65.49 - samples/sec: 1703.78 - lr: 0.000045 - momentum: 0.000000
2023-10-17 09:51:49,968 epoch 1 - iter 770/773 - loss 0.32227332 - time (sec): 72.85 - samples/sec: 1702.09 - lr: 0.000050 - momentum: 0.000000
2023-10-17 09:51:50,232 ----------------------------------------------------------------------------------------------------
2023-10-17 09:51:50,232 EPOCH 1 done: loss 0.3216 - lr: 0.000050
2023-10-17 09:51:52,930 DEV : loss 0.05330459401011467 - f1-score (micro avg) 0.7489
2023-10-17 09:51:52,960 saving best model
2023-10-17 09:51:53,514 ----------------------------------------------------------------------------------------------------
2023-10-17 09:52:00,448 epoch 2 - iter 77/773 - loss 0.09677124 - time (sec):
6.93 - samples/sec: 1704.73 - lr: 0.000049 - momentum: 0.000000
2023-10-17 09:52:07,574 epoch 2 - iter 154/773 - loss 0.08008441 - time (sec): 14.06 - samples/sec: 1718.53 - lr: 0.000049 - momentum: 0.000000
2023-10-17 09:52:14,632 epoch 2 - iter 231/773 - loss 0.07559641 - time (sec): 21.12 - samples/sec: 1785.39 - lr: 0.000048 - momentum: 0.000000
2023-10-17 09:52:21,666 epoch 2 - iter 308/773 - loss 0.07646491 - time (sec): 28.15 - samples/sec: 1780.87 - lr: 0.000048 - momentum: 0.000000
2023-10-17 09:52:28,587 epoch 2 - iter 385/773 - loss 0.07753478 - time (sec): 35.07 - samples/sec: 1789.21 - lr: 0.000047 - momentum: 0.000000
2023-10-17 09:52:35,688 epoch 2 - iter 462/773 - loss 0.07853793 - time (sec): 42.17 - samples/sec: 1776.69 - lr: 0.000047 - momentum: 0.000000
2023-10-17 09:52:42,856 epoch 2 - iter 539/773 - loss 0.07587851 - time (sec): 49.34 - samples/sec: 1772.60 - lr: 0.000046 - momentum: 0.000000
2023-10-17 09:52:50,035 epoch 2 - iter 616/773 - loss 0.07502133 - time (sec): 56.52 - samples/sec: 1778.78 - lr: 0.000046 - momentum: 0.000000
2023-10-17 09:52:57,050 epoch 2 - iter 693/773 - loss 0.07354922 - time (sec): 63.53 - samples/sec: 1766.03 - lr: 0.000045 - momentum: 0.000000
2023-10-17 09:53:03,992 epoch 2 - iter 770/773 - loss 0.07444958 - time (sec): 70.48 - samples/sec: 1759.81 - lr: 0.000044 - momentum: 0.000000
2023-10-17 09:53:04,245 ----------------------------------------------------------------------------------------------------
2023-10-17 09:53:04,245 EPOCH 2 done: loss 0.0746 - lr: 0.000044
2023-10-17 09:53:07,171 DEV : loss 0.06132051348686218 - f1-score (micro avg) 0.6713
2023-10-17 09:53:07,199 ----------------------------------------------------------------------------------------------------
2023-10-17 09:53:13,887 epoch 3 - iter 77/773 - loss 0.04795621 - time (sec): 6.69 - samples/sec: 1748.40 - lr: 0.000044 - momentum: 0.000000
2023-10-17 09:53:21,207 epoch 3 - iter 154/773 - loss 0.04943447 - time (sec): 14.01 -
samples/sec: 1772.77 - lr: 0.000043 - momentum: 0.000000
2023-10-17 09:53:29,211 epoch 3 - iter 231/773 - loss 0.04886716 - time (sec): 22.01 - samples/sec: 1734.92 - lr: 0.000043 - momentum: 0.000000
2023-10-17 09:53:36,821 epoch 3 - iter 308/773 - loss 0.04640129 - time (sec): 29.62 - samples/sec: 1708.16 - lr: 0.000042 - momentum: 0.000000
2023-10-17 09:53:44,470 epoch 3 - iter 385/773 - loss 0.04816842 - time (sec): 37.27 - samples/sec: 1675.52 - lr: 0.000042 - momentum: 0.000000
2023-10-17 09:53:52,356 epoch 3 - iter 462/773 - loss 0.05098456 - time (sec): 45.15 - samples/sec: 1662.59 - lr: 0.000041 - momentum: 0.000000
2023-10-17 09:53:59,926 epoch 3 - iter 539/773 - loss 0.05101299 - time (sec): 52.73 - samples/sec: 1653.10 - lr: 0.000041 - momentum: 0.000000
2023-10-17 09:54:06,734 epoch 3 - iter 616/773 - loss 0.05078472 - time (sec): 59.53 - samples/sec: 1671.82 - lr: 0.000040 - momentum: 0.000000
2023-10-17 09:54:13,425 epoch 3 - iter 693/773 - loss 0.05251785 - time (sec): 66.22 - samples/sec: 1666.04 - lr: 0.000039 - momentum: 0.000000
2023-10-17 09:54:20,384 epoch 3 - iter 770/773 - loss 0.05247431 - time (sec): 73.18 - samples/sec: 1692.98 - lr: 0.000039 - momentum: 0.000000
2023-10-17 09:54:20,643 ----------------------------------------------------------------------------------------------------
2023-10-17 09:54:20,643 EPOCH 3 done: loss 0.0524 - lr: 0.000039
2023-10-17 09:54:23,554 DEV : loss 0.05568350851535797 - f1-score (micro avg) 0.7886
2023-10-17 09:54:23,582 saving best model
2023-10-17 09:54:24,988 ----------------------------------------------------------------------------------------------------
2023-10-17 09:54:31,745 epoch 4 - iter 77/773 - loss 0.03917114 - time (sec): 6.75 - samples/sec: 1903.55 - lr: 0.000038 - momentum: 0.000000
2023-10-17 09:54:38,162 epoch 4 - iter 154/773 - loss 0.03481858 - time (sec): 13.17 - samples/sec: 1862.13 - lr: 0.000038 - momentum: 0.000000
2023-10-17 09:54:45,104 epoch 4 - iter 231/773 - loss
0.03357236 - time (sec): 20.11 - samples/sec: 1865.14 - lr: 0.000037 - momentum: 0.000000
2023-10-17 09:54:52,050 epoch 4 - iter 308/773 - loss 0.03530401 - time (sec): 27.06 - samples/sec: 1847.93 - lr: 0.000037 - momentum: 0.000000
2023-10-17 09:54:58,597 epoch 4 - iter 385/773 - loss 0.03461690 - time (sec): 33.61 - samples/sec: 1857.33 - lr: 0.000036 - momentum: 0.000000
2023-10-17 09:55:05,499 epoch 4 - iter 462/773 - loss 0.03627980 - time (sec): 40.51 - samples/sec: 1859.14 - lr: 0.000036 - momentum: 0.000000
2023-10-17 09:55:12,971 epoch 4 - iter 539/773 - loss 0.03653929 - time (sec): 47.98 - samples/sec: 1834.41 - lr: 0.000035 - momentum: 0.000000
2023-10-17 09:55:19,826 epoch 4 - iter 616/773 - loss 0.03616690 - time (sec): 54.83 - samples/sec: 1818.25 - lr: 0.000034 - momentum: 0.000000
2023-10-17 09:55:27,192 epoch 4 - iter 693/773 - loss 0.03664448 - time (sec): 62.20 - samples/sec: 1792.69 - lr: 0.000034 - momentum: 0.000000
2023-10-17 09:55:34,633 epoch 4 - iter 770/773 - loss 0.03696210 - time (sec): 69.64 - samples/sec: 1779.84 - lr: 0.000033 - momentum: 0.000000
2023-10-17 09:55:34,898 ----------------------------------------------------------------------------------------------------
2023-10-17 09:55:34,898 EPOCH 4 done: loss 0.0372 - lr: 0.000033
2023-10-17 09:55:37,813 DEV : loss 0.09010311961174011 - f1-score (micro avg) 0.795
2023-10-17 09:55:37,844 saving best model
2023-10-17 09:55:39,252 ----------------------------------------------------------------------------------------------------
2023-10-17 09:55:46,196 epoch 5 - iter 77/773 - loss 0.02882609 - time (sec): 6.94 - samples/sec: 1709.68 - lr: 0.000033 - momentum: 0.000000
2023-10-17 09:55:53,133 epoch 5 - iter 154/773 - loss 0.02484139 - time (sec): 13.88 - samples/sec: 1750.43 - lr: 0.000032 - momentum: 0.000000
2023-10-17 09:56:00,065 epoch 5 - iter 231/773 - loss 0.02494855 - time (sec): 20.81 - samples/sec: 1738.11 - lr: 0.000032 - momentum: 0.000000
2023-10-17 09:56:07,071 epoch
5 - iter 308/773 - loss 0.02438607 - time (sec): 27.81 - samples/sec: 1740.43 - lr: 0.000031 - momentum: 0.000000
2023-10-17 09:56:14,264 epoch 5 - iter 385/773 - loss 0.02602136 - time (sec): 35.01 - samples/sec: 1753.53 - lr: 0.000031 - momentum: 0.000000
2023-10-17 09:56:21,203 epoch 5 - iter 462/773 - loss 0.02629902 - time (sec): 41.95 - samples/sec: 1764.17 - lr: 0.000030 - momentum: 0.000000
2023-10-17 09:56:28,542 epoch 5 - iter 539/773 - loss 0.02576945 - time (sec): 49.28 - samples/sec: 1755.14 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:56:35,568 epoch 5 - iter 616/773 - loss 0.02604685 - time (sec): 56.31 - samples/sec: 1752.21 - lr: 0.000029 - momentum: 0.000000
2023-10-17 09:56:42,730 epoch 5 - iter 693/773 - loss 0.02654183 - time (sec): 63.47 - samples/sec: 1763.08 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:56:50,088 epoch 5 - iter 770/773 - loss 0.02657177 - time (sec): 70.83 - samples/sec: 1747.09 - lr: 0.000028 - momentum: 0.000000
2023-10-17 09:56:50,394 ----------------------------------------------------------------------------------------------------
2023-10-17 09:56:50,395 EPOCH 5 done: loss 0.0266 - lr: 0.000028
2023-10-17 09:56:53,269 DEV : loss 0.1054750606417656 - f1-score (micro avg) 0.7785
2023-10-17 09:56:53,298 ----------------------------------------------------------------------------------------------------
2023-10-17 09:57:00,470 epoch 6 - iter 77/773 - loss 0.01388745 - time (sec): 7.17 - samples/sec: 1788.67 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:57:07,341 epoch 6 - iter 154/773 - loss 0.01126746 - time (sec): 14.04 - samples/sec: 1832.99 - lr: 0.000027 - momentum: 0.000000
2023-10-17 09:57:14,224 epoch 6 - iter 231/773 - loss 0.01379040 - time (sec): 20.92 - samples/sec: 1817.27 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:57:21,174 epoch 6 - iter 308/773 - loss 0.01635429 - time (sec): 27.87 - samples/sec: 1813.88 - lr: 0.000026 - momentum: 0.000000
2023-10-17 09:57:28,078 epoch 6 - iter 385/773 -
loss 0.01738643 - time (sec): 34.78 - samples/sec: 1826.82 - lr: 0.000025 - momentum: 0.000000
2023-10-17 09:57:34,818 epoch 6 - iter 462/773 - loss 0.01857094 - time (sec): 41.52 - samples/sec: 1809.73 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:57:41,802 epoch 6 - iter 539/773 - loss 0.01798259 - time (sec): 48.50 - samples/sec: 1791.72 - lr: 0.000024 - momentum: 0.000000
2023-10-17 09:57:49,468 epoch 6 - iter 616/773 - loss 0.01750456 - time (sec): 56.17 - samples/sec: 1757.49 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:57:56,975 epoch 6 - iter 693/773 - loss 0.01733636 - time (sec): 63.68 - samples/sec: 1748.90 - lr: 0.000023 - momentum: 0.000000
2023-10-17 09:58:03,735 epoch 6 - iter 770/773 - loss 0.01777726 - time (sec): 70.44 - samples/sec: 1758.81 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:58:03,988 ----------------------------------------------------------------------------------------------------
2023-10-17 09:58:03,989 EPOCH 6 done: loss 0.0177 - lr: 0.000022
2023-10-17 09:58:07,150 DEV : loss 0.11583945155143738 - f1-score (micro avg) 0.7778
2023-10-17 09:58:07,202 ----------------------------------------------------------------------------------------------------
2023-10-17 09:58:13,949 epoch 7 - iter 77/773 - loss 0.00456103 - time (sec): 6.74 - samples/sec: 1737.20 - lr: 0.000022 - momentum: 0.000000
2023-10-17 09:58:20,748 epoch 7 - iter 154/773 - loss 0.01048448 - time (sec): 13.54 - samples/sec: 1752.44 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:58:27,539 epoch 7 - iter 231/773 - loss 0.01223233 - time (sec): 20.33 - samples/sec: 1781.78 - lr: 0.000021 - momentum: 0.000000
2023-10-17 09:58:34,652 epoch 7 - iter 308/773 - loss 0.01163819 - time (sec): 27.45 - samples/sec: 1783.17 - lr: 0.000020 - momentum: 0.000000
2023-10-17 09:58:42,661 epoch 7 - iter 385/773 - loss 0.01140701 - time (sec): 35.46 - samples/sec: 1736.84 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:58:50,477 epoch 7 - iter 462/773 - loss 0.01053450 -
time (sec): 43.27 - samples/sec: 1708.48 - lr: 0.000019 - momentum: 0.000000
2023-10-17 09:58:57,739 epoch 7 - iter 539/773 - loss 0.01010124 - time (sec): 50.53 - samples/sec: 1701.36 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:59:04,851 epoch 7 - iter 616/773 - loss 0.00993634 - time (sec): 57.65 - samples/sec: 1718.81 - lr: 0.000018 - momentum: 0.000000
2023-10-17 09:59:12,153 epoch 7 - iter 693/773 - loss 0.01039778 - time (sec): 64.95 - samples/sec: 1722.45 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:59:19,114 epoch 7 - iter 770/773 - loss 0.01064432 - time (sec): 71.91 - samples/sec: 1720.07 - lr: 0.000017 - momentum: 0.000000
2023-10-17 09:59:19,398 ----------------------------------------------------------------------------------------------------
2023-10-17 09:59:19,398 EPOCH 7 done: loss 0.0106 - lr: 0.000017
2023-10-17 09:59:22,806 DEV : loss 0.11872334033250809 - f1-score (micro avg) 0.818
2023-10-17 09:59:22,836 saving best model
2023-10-17 09:59:24,267 ----------------------------------------------------------------------------------------------------
2023-10-17 09:59:31,418 epoch 8 - iter 77/773 - loss 0.01473996 - time (sec): 7.14 - samples/sec: 1732.87 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:59:39,170 epoch 8 - iter 154/773 - loss 0.01203286 - time (sec): 14.90 - samples/sec: 1696.02 - lr: 0.000016 - momentum: 0.000000
2023-10-17 09:59:46,392 epoch 8 - iter 231/773 - loss 0.01161199 - time (sec): 22.12 - samples/sec: 1689.65 - lr: 0.000015 - momentum: 0.000000
2023-10-17 09:59:53,350 epoch 8 - iter 308/773 - loss 0.00966991 - time (sec): 29.08 - samples/sec: 1702.11 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:00:00,653 epoch 8 - iter 385/773 - loss 0.00955744 - time (sec): 36.38 - samples/sec: 1690.94 - lr: 0.000014 - momentum: 0.000000
2023-10-17 10:00:08,701 epoch 8 - iter 462/773 - loss 0.00952576 - time (sec): 44.43 - samples/sec: 1680.49 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:00:16,534 epoch 8 - iter
539/773 - loss 0.00909750 - time (sec): 52.26 - samples/sec: 1676.63 - lr: 0.000013 - momentum: 0.000000
2023-10-17 10:00:24,416 epoch 8 - iter 616/773 - loss 0.00903899 - time (sec): 60.14 - samples/sec: 1654.18 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:00:31,889 epoch 8 - iter 693/773 - loss 0.00893252 - time (sec): 67.61 - samples/sec: 1641.23 - lr: 0.000012 - momentum: 0.000000
2023-10-17 10:00:39,288 epoch 8 - iter 770/773 - loss 0.00862898 - time (sec): 75.01 - samples/sec: 1652.22 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:00:39,604 ----------------------------------------------------------------------------------------------------
2023-10-17 10:00:39,605 EPOCH 8 done: loss 0.0086 - lr: 0.000011
2023-10-17 10:00:42,778 DEV : loss 0.12361815571784973 - f1-score (micro avg) 0.778
2023-10-17 10:00:42,809 ----------------------------------------------------------------------------------------------------
2023-10-17 10:00:50,878 epoch 9 - iter 77/773 - loss 0.00424221 - time (sec): 8.07 - samples/sec: 1562.76 - lr: 0.000011 - momentum: 0.000000
2023-10-17 10:00:58,265 epoch 9 - iter 154/773 - loss 0.00382206 - time (sec): 15.45 - samples/sec: 1587.14 - lr: 0.000010 - momentum: 0.000000
2023-10-17 10:01:05,562 epoch 9 - iter 231/773 - loss 0.00494659 - time (sec): 22.75 - samples/sec: 1643.92 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:01:13,223 epoch 9 - iter 308/773 - loss 0.00538922 - time (sec): 30.41 - samples/sec: 1615.47 - lr: 0.000009 - momentum: 0.000000
2023-10-17 10:01:20,951 epoch 9 - iter 385/773 - loss 0.00585671 - time (sec): 38.14 - samples/sec: 1622.67 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:01:28,146 epoch 9 - iter 462/773 - loss 0.00592916 - time (sec): 45.33 - samples/sec: 1632.48 - lr: 0.000008 - momentum: 0.000000
2023-10-17 10:01:35,335 epoch 9 - iter 539/773 - loss 0.00543064 - time (sec): 52.52 - samples/sec: 1654.27 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:01:42,299 epoch 9 - iter 616/773 - loss
0.00549350 - time (sec): 59.49 - samples/sec: 1664.43 - lr: 0.000007 - momentum: 0.000000
2023-10-17 10:01:49,520 epoch 9 - iter 693/773 - loss 0.00513001 - time (sec): 66.71 - samples/sec: 1683.75 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:01:56,696 epoch 9 - iter 770/773 - loss 0.00549165 - time (sec): 73.89 - samples/sec: 1675.81 - lr: 0.000006 - momentum: 0.000000
2023-10-17 10:01:56,967 ----------------------------------------------------------------------------------------------------
2023-10-17 10:01:56,967 EPOCH 9 done: loss 0.0055 - lr: 0.000006
2023-10-17 10:01:59,842 DEV : loss 0.12155171483755112 - f1-score (micro avg) 0.7975
2023-10-17 10:01:59,871 ----------------------------------------------------------------------------------------------------
2023-10-17 10:02:06,780 epoch 10 - iter 77/773 - loss 0.00669265 - time (sec): 6.91 - samples/sec: 1812.77 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:02:13,747 epoch 10 - iter 154/773 - loss 0.00465994 - time (sec): 13.87 - samples/sec: 1786.79 - lr: 0.000005 - momentum: 0.000000
2023-10-17 10:02:20,720 epoch 10 - iter 231/773 - loss 0.00329970 - time (sec): 20.85 - samples/sec: 1815.30 - lr: 0.000004 - momentum: 0.000000
2023-10-17 10:02:27,633 epoch 10 - iter 308/773 - loss 0.00309264 - time (sec): 27.76 - samples/sec: 1810.49 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:02:34,497 epoch 10 - iter 385/773 - loss 0.00320190 - time (sec): 34.62 - samples/sec: 1805.14 - lr: 0.000003 - momentum: 0.000000
2023-10-17 10:02:41,428 epoch 10 - iter 462/773 - loss 0.00367371 - time (sec): 41.56 - samples/sec: 1786.35 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:02:48,391 epoch 10 - iter 539/773 - loss 0.00352687 - time (sec): 48.52 - samples/sec: 1793.81 - lr: 0.000002 - momentum: 0.000000
2023-10-17 10:02:55,392 epoch 10 - iter 616/773 - loss 0.00364918 - time (sec): 55.52 - samples/sec: 1780.19 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:03:02,317 epoch 10 - iter 693/773 - loss
0.00341373 - time (sec): 62.44 - samples/sec: 1784.63 - lr: 0.000001 - momentum: 0.000000
2023-10-17 10:03:09,513 epoch 10 - iter 770/773 - loss 0.00316098 - time (sec): 69.64 - samples/sec: 1778.30 - lr: 0.000000 - momentum: 0.000000
2023-10-17 10:03:09,772 ----------------------------------------------------------------------------------------------------
2023-10-17 10:03:09,772 EPOCH 10 done: loss 0.0031 - lr: 0.000000
2023-10-17 10:03:12,617 DEV : loss 0.12296666949987411 - f1-score (micro avg) 0.7942
2023-10-17 10:03:13,213 ----------------------------------------------------------------------------------------------------
2023-10-17 10:03:13,215 Loading model from best epoch ...
2023-10-17 10:03:15,488 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-BUILDING, B-BUILDING, E-BUILDING, I-BUILDING, S-STREET, B-STREET, E-STREET, I-STREET
2023-10-17 10:03:23,976 Results:
- F-score (micro) 0.7995
- F-score (macro) 0.7004
- Accuracy 0.6805

By class:
              precision    recall  f1-score   support

         LOC     0.8866    0.8182    0.8510       946
    BUILDING     0.6207    0.4865    0.5455       185
      STREET     0.7551    0.6607    0.7048        56

   micro avg     0.8444    0.7591    0.7995      1187
   macro avg     0.7541    0.6551    0.7004      1187
weighted avg     0.8390    0.7591    0.7965      1187

2023-10-17 10:03:23,976 ----------------------------------------------------------------------------------------------------
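
Note on the final scores: the aggregate rows of the "By class" table can be re-derived from the per-class rows. The sketch below is not part of the logged run and is not how Flair computes its report internally; it only copies the rounded table values by hand and applies the standard definitions (macro = unweighted mean of class F1, weighted = support-weighted mean, micro F1 = harmonic mean of micro precision and recall). The names `by_class` and `f1` are illustrative, and agreement is only expected up to the four-decimal rounding used in the log.

```python
from statistics import mean

# Per-class rows copied from the "By class" table above:
# label -> (precision, recall, f1-score, support)
by_class = {
    "LOC": (0.8866, 0.8182, 0.8510, 946),
    "BUILDING": (0.6207, 0.4865, 0.5455, 185),
    "STREET": (0.7551, 0.6607, 0.7048, 56),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Macro average: every class counts equally, regardless of support.
macro_f1 = mean(row[2] for row in by_class.values())

# Weighted average: each class F1 is weighted by its support.
total_support = sum(row[3] for row in by_class.values())
weighted_f1 = sum(row[2] * row[3] for row in by_class.values()) / total_support

# Micro F1 from the logged micro-averaged precision and recall.
micro_f1 = f1(0.8444, 0.7591)

print(f"support={total_support} micro={micro_f1:.4f} "
      f"macro={macro_f1:.4f} weighted={weighted_f1:.4f}")
```

The gap between the micro and macro scores follows from the class imbalance: LOC holds 946 of the 1187 gold spans, so it dominates the micro average, while the weaker BUILDING class pulls the macro average down.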