2023-10-17 08:30:24,886 ----------------------------------------------------------------------------------------------------
2023-10-17 08:30:24,887 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 08:30:24,887 ----------------------------------------------------------------------------------------------------
2023-10-17 08:30:24,888 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-17 08:30:24,888 ----------------------------------------------------------------------------------------------------
2023-10-17 08:30:24,888 Train: 1100 sentences
2023-10-17 08:30:24,888 (train_with_dev=False, train_with_test=False)
2023-10-17 08:30:24,888 ----------------------------------------------------------------------------------------------------
2023-10-17 08:30:24,888 Training Params:
2023-10-17 08:30:24,888 - learning_rate: "5e-05"
2023-10-17 08:30:24,888 - mini_batch_size: "4"
2023-10-17 08:30:24,888 - max_epochs: "10"
2023-10-17 08:30:24,888 - shuffle: "True"
2023-10-17 08:30:24,888 ----------------------------------------------------------------------------------------------------
2023-10-17 08:30:24,888 Plugins:
2023-10-17 08:30:24,888 - TensorboardLogger
2023-10-17 08:30:24,888 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 08:30:24,888 ----------------------------------------------------------------------------------------------------
2023-10-17 08:30:24,888 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 08:30:24,888 - metric: "('micro avg', 'f1-score')"
2023-10-17 08:30:24,888 ----------------------------------------------------------------------------------------------------
2023-10-17 08:30:24,888 Computation:
2023-10-17 08:30:24,888 - compute on device: cuda:0
2023-10-17 08:30:24,888 - embedding storage: none
2023-10-17 08:30:24,888 ----------------------------------------------------------------------------------------------------
2023-10-17 08:30:24,888 Model training base path: "hmbench-ajmc/de-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-17 08:30:24,888 ----------------------------------------------------------------------------------------------------
2023-10-17 08:30:24,888 ----------------------------------------------------------------------------------------------------
2023-10-17 08:30:24,889 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 08:30:26,127 epoch 1 - iter 27/275 - loss 4.10845835 - time (sec): 1.24 - samples/sec: 1804.32 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:30:27,353 epoch 1 - iter 54/275 - loss 3.17367960 - time (sec): 2.46 - samples/sec: 1703.46 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:30:28,561 epoch 1 - iter 81/275 - loss 2.52030692 - time (sec): 3.67 - samples/sec: 1694.03 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:30:29,770 epoch 1 - iter 108/275 - loss 2.06764029 - time (sec): 4.88 - samples/sec: 1710.52 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:30:31,000 epoch 1 - iter 135/275 - loss 1.71191043 - time (sec): 6.11 - samples/sec: 1771.78 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:30:32,215 epoch 1 - iter 162/275 - loss 1.48728930 - time (sec): 7.33 - samples/sec: 1784.57 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:30:33,435 epoch 1 - iter 189/275 - loss 1.31869583 - time (sec): 8.55 - samples/sec: 1786.00 - lr: 0.000034 - momentum: 0.000000
2023-10-17 08:30:34,657 epoch 1 - iter 216/275 - loss 1.17733301 - time (sec): 9.77 - samples/sec: 1804.59 - lr: 0.000039 - momentum: 0.000000
2023-10-17 08:30:35,866 epoch 1 - iter 243/275 - loss 1.07654199 - time (sec): 10.98 - samples/sec: 1818.69 - lr: 0.000044 - momentum: 0.000000
2023-10-17 08:30:37,123 epoch 1 - iter 270/275 - loss 0.99539183 - time (sec): 12.23 - samples/sec: 1826.36 - lr: 0.000049 - momentum: 0.000000
2023-10-17 08:30:37,353 ----------------------------------------------------------------------------------------------------
2023-10-17 08:30:37,354 EPOCH 1 done: loss 0.9843 - lr: 0.000049
2023-10-17 08:30:38,059 DEV : loss 0.19101294875144958 - f1-score (micro avg)  0.7729
2023-10-17 08:30:38,064 saving best model
2023-10-17 08:30:38,426 ----------------------------------------------------------------------------------------------------
2023-10-17 08:30:39,671 epoch 2 - iter 27/275 - loss 0.16708351 - time (sec): 1.24 - samples/sec: 1894.18 - lr: 0.000049 - momentum: 0.000000
2023-10-17 08:30:40,921 epoch 2 - iter 54/275 - loss 0.18795173 - time (sec): 2.49 - samples/sec: 1927.42 - lr: 0.000049 - momentum: 0.000000
2023-10-17 08:30:42,140 epoch 2 - iter 81/275 - loss 0.17869383 - time (sec): 3.71 - samples/sec: 1910.18 - lr: 0.000048 - momentum: 0.000000
2023-10-17 08:30:43,408 epoch 2 - iter 108/275 - loss 0.17781151 - time (sec): 4.98 - samples/sec: 1905.89 - lr: 0.000048 - momentum: 0.000000
2023-10-17 08:30:44,640 epoch 2 - iter 135/275 - loss 0.17499214 - time (sec): 6.21 - samples/sec: 1844.41 - lr: 0.000047 - momentum: 0.000000
2023-10-17 08:30:45,870 epoch 2 - iter 162/275 - loss 0.16398297 - time (sec): 7.44 - samples/sec: 1807.37 - lr: 0.000047 - momentum: 0.000000
2023-10-17 08:30:47,082 epoch 2 - iter 189/275 - loss 0.16445697 - time (sec): 8.65 - samples/sec: 1793.67 - lr: 0.000046 - momentum: 0.000000
2023-10-17 08:30:48,304 epoch 2 - iter 216/275 - loss 0.16253484 - time (sec): 9.88 - samples/sec: 1814.23 - lr: 0.000046 - momentum: 0.000000
2023-10-17 08:30:49,543 epoch 2 - iter 243/275 - loss 0.16341929 - time (sec): 11.12 - samples/sec: 1808.45 - lr: 0.000045 - momentum: 0.000000
2023-10-17 08:30:50,753 epoch 2 - iter 270/275 - loss 0.16569683 - time (sec): 12.33 - samples/sec: 1815.68 - lr: 0.000045 - momentum: 0.000000
2023-10-17 08:30:50,981 ----------------------------------------------------------------------------------------------------
2023-10-17 08:30:50,981 EPOCH 2 done: loss 0.1658 - lr: 0.000045
2023-10-17 08:30:51,625 DEV : loss 0.2656913697719574 - f1-score (micro avg)  0.7348
2023-10-17 08:30:51,630 ----------------------------------------------------------------------------------------------------
2023-10-17 08:30:52,857 epoch 3 - iter 27/275 - loss 0.12314013 - time (sec): 1.23 - samples/sec: 1896.85 - lr: 0.000044 - momentum: 0.000000
2023-10-17 08:30:54,075 epoch 3 - iter 54/275 - loss 0.12069488 - time (sec): 2.44 - samples/sec: 1842.53 - lr: 0.000043 - momentum: 0.000000
2023-10-17 08:30:55,288 epoch 3 - iter 81/275 - loss 0.12029024 - time (sec): 3.66 - samples/sec: 1873.03 - lr: 0.000043 - momentum: 0.000000
2023-10-17 08:30:56,483 epoch 3 - iter 108/275 - loss 0.12416397 - time (sec): 4.85 - samples/sec: 1849.46 - lr: 0.000042 - momentum: 0.000000
2023-10-17 08:30:57,647 epoch 3 - iter 135/275 - loss 0.10890458 - time (sec): 6.02 - samples/sec: 1862.82 - lr: 0.000042 - momentum: 0.000000
2023-10-17 08:30:58,819 epoch 3 - iter 162/275 - loss 0.11047994 - time (sec): 7.19 - samples/sec: 1871.39 - lr: 0.000041 - momentum: 0.000000
2023-10-17 08:30:59,972 epoch 3 - iter 189/275 - loss 0.11415845 - time (sec): 8.34 - samples/sec: 1882.14 - lr: 0.000041 - momentum: 0.000000
2023-10-17 08:31:01,131 epoch 3 - iter 216/275 - loss 0.11185680 - time (sec): 9.50 - samples/sec: 1855.80 - lr: 0.000040 - momentum: 0.000000
2023-10-17 08:31:02,300 epoch 3 - iter 243/275 - loss 0.10926904 - time (sec): 10.67 - samples/sec: 1871.98 - lr: 0.000040 - momentum: 0.000000
2023-10-17 08:31:03,475 epoch 3 - iter 270/275 - loss 0.11069511 - time (sec): 11.84 - samples/sec: 1883.24 - lr: 0.000039 - momentum: 0.000000
2023-10-17 08:31:03,691 ----------------------------------------------------------------------------------------------------
2023-10-17 08:31:03,691 EPOCH 3 done: loss 0.1092 - lr: 0.000039
2023-10-17 08:31:04,328 DEV : loss 0.18171313405036926 - f1-score (micro avg)  0.8424
2023-10-17 08:31:04,333 saving best model
2023-10-17 08:31:04,782 ----------------------------------------------------------------------------------------------------
2023-10-17 08:31:06,088 epoch 4 - iter 27/275 - loss 0.08230065 - time (sec): 1.30 - samples/sec: 1736.86 - lr: 0.000038 - momentum: 0.000000
2023-10-17 08:31:07,406 epoch 4 - iter 54/275 - loss 0.10509858 - time (sec): 2.62 - samples/sec: 1716.68 - lr: 0.000038 - momentum: 0.000000
2023-10-17 08:31:08,707 epoch 4 - iter 81/275 - loss 0.08322093 - time (sec): 3.92 - samples/sec: 1751.31 - lr: 0.000037 - momentum: 0.000000
2023-10-17 08:31:09,911 epoch 4 - iter 108/275 - loss 0.08218280 - time (sec): 5.13 - samples/sec: 1759.04 - lr: 0.000037 - momentum: 0.000000
2023-10-17 08:31:11,169 epoch 4 - iter 135/275 - loss 0.08625605 - time (sec): 6.39 - samples/sec: 1784.18 - lr: 0.000036 - momentum: 0.000000
2023-10-17 08:31:12,405 epoch 4 - iter 162/275 - loss 0.08354399 - time (sec): 7.62 - samples/sec: 1797.28 - lr: 0.000036 - momentum: 0.000000
2023-10-17 08:31:13,620 epoch 4 - iter 189/275 - loss 0.08663562 - time (sec): 8.84 - samples/sec: 1788.94 - lr: 0.000035 - momentum: 0.000000
2023-10-17 08:31:14,825 epoch 4 - iter 216/275 - loss 0.08498872 - time (sec): 10.04 - samples/sec: 1787.07 - lr: 0.000035 - momentum: 0.000000
2023-10-17 08:31:16,092 epoch 4 - iter 243/275 - loss 0.08424941 - time (sec): 11.31 - samples/sec: 1801.40 - lr: 0.000034 - momentum: 0.000000
2023-10-17 08:31:17,382 epoch 4 - iter 270/275 - loss 0.08718953 - time (sec): 12.60 - samples/sec: 1768.08 - lr: 0.000034 - momentum: 0.000000
2023-10-17 08:31:17,609 ----------------------------------------------------------------------------------------------------
2023-10-17 08:31:17,609 EPOCH 4 done: loss 0.0889 - lr: 0.000034
2023-10-17 08:31:18,251 DEV : loss 0.18058346211910248 - f1-score (micro avg)  0.8699
2023-10-17 08:31:18,257 saving best model
2023-10-17 08:31:18,710 ----------------------------------------------------------------------------------------------------
2023-10-17 08:31:20,044 epoch 5 - iter 27/275 - loss 0.03783434 - time (sec): 1.33 - samples/sec: 1921.10 - lr: 0.000033 - momentum: 0.000000
2023-10-17 08:31:21,256 epoch 5 - iter 54/275 - loss 0.05895032 - time (sec): 2.54 - samples/sec: 1795.35 - lr: 0.000032 - momentum: 0.000000
2023-10-17 08:31:22,548 epoch 5 - iter 81/275 - loss 0.07043600 - time (sec): 3.84 - samples/sec: 1736.18 - lr: 0.000032 - momentum: 0.000000
2023-10-17 08:31:23,903 epoch 5 - iter 108/275 - loss 0.05868539 - time (sec): 5.19 - samples/sec: 1700.63 - lr: 0.000031 - momentum: 0.000000
2023-10-17 08:31:25,218 epoch 5 - iter 135/275 - loss 0.05553262 - time (sec): 6.51 - samples/sec: 1703.83 - lr: 0.000031 - momentum: 0.000000
2023-10-17 08:31:26,477 epoch 5 - iter 162/275 - loss 0.06043531 - time (sec): 7.76 - samples/sec: 1730.87 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:31:27,716 epoch 5 - iter 189/275 - loss 0.06355344 - time (sec): 9.00 - samples/sec: 1749.48 - lr: 0.000030 - momentum: 0.000000
2023-10-17 08:31:28,949 epoch 5 - iter 216/275 - loss 0.06329613 - time (sec): 10.24 - samples/sec: 1772.06 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:31:30,176 epoch 5 - iter 243/275 - loss 0.06244041 - time (sec): 11.46 - samples/sec: 1772.11 - lr: 0.000029 - momentum: 0.000000
2023-10-17 08:31:31,432 epoch 5 - iter 270/275 - loss 0.06613267 - time (sec): 12.72 - samples/sec: 1760.73 - lr: 0.000028 - momentum: 0.000000
2023-10-17 08:31:31,649 ----------------------------------------------------------------------------------------------------
2023-10-17 08:31:31,649 EPOCH 5 done: loss 0.0655 - lr: 0.000028
2023-10-17 08:31:32,297 DEV : loss 0.17223069071769714 - f1-score (micro avg)  0.8575
2023-10-17 08:31:32,301 ----------------------------------------------------------------------------------------------------
2023-10-17 08:31:33,556 epoch 6 - iter 27/275 - loss 0.06466137 - time (sec): 1.25 - samples/sec: 1815.06 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:31:34,796 epoch 6 - iter 54/275 - loss 0.03969681 - time (sec): 2.49 - samples/sec: 1735.98 - lr: 0.000027 - momentum: 0.000000
2023-10-17 08:31:36,038 epoch 6 - iter 81/275 - loss 0.03526760 - time (sec): 3.74 - samples/sec: 1773.15 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:31:37,286 epoch 6 - iter 108/275 - loss 0.04104000 - time (sec): 4.98 - samples/sec: 1763.04 - lr: 0.000026 - momentum: 0.000000
2023-10-17 08:31:38,542 epoch 6 - iter 135/275 - loss 0.04349194 - time (sec): 6.24 - samples/sec: 1771.36 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:31:39,782 epoch 6 - iter 162/275 - loss 0.04815338 - time (sec): 7.48 - samples/sec: 1803.36 - lr: 0.000025 - momentum: 0.000000
2023-10-17 08:31:41,049 epoch 6 - iter 189/275 - loss 0.05238140 - time (sec): 8.75 - samples/sec: 1817.49 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:31:42,284 epoch 6 - iter 216/275 - loss 0.05213888 - time (sec): 9.98 - samples/sec: 1818.37 - lr: 0.000024 - momentum: 0.000000
2023-10-17 08:31:43,521 epoch 6 - iter 243/275 - loss 0.05007935 - time (sec): 11.22 - samples/sec: 1804.55 - lr: 0.000023 - momentum: 0.000000
2023-10-17 08:31:44,786 epoch 6 - iter 270/275 - loss 0.04794810 - time (sec): 12.48 - samples/sec: 1797.17 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:31:45,002 ----------------------------------------------------------------------------------------------------
2023-10-17 08:31:45,002 EPOCH 6 done: loss 0.0472 - lr: 0.000022
2023-10-17 08:31:45,637 DEV : loss 0.19167360663414001 - f1-score (micro avg)  0.8523
2023-10-17 08:31:45,642 ----------------------------------------------------------------------------------------------------
2023-10-17 08:31:46,858 epoch 7 - iter 27/275 - loss 0.03846715 - time (sec): 1.22 - samples/sec: 1998.08 - lr: 0.000022 - momentum: 0.000000
2023-10-17 08:31:48,099 epoch 7 - iter 54/275 - loss 0.02789805 - time (sec): 2.46 - samples/sec: 1955.07 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:31:49,294 epoch 7 - iter 81/275 - loss 0.04892824 - time (sec): 3.65 - samples/sec: 1874.35 - lr: 0.000021 - momentum: 0.000000
2023-10-17 08:31:50,545 epoch 7 - iter 108/275 - loss 0.03984377 - time (sec): 4.90 - samples/sec: 1856.05 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:31:51,820 epoch 7 - iter 135/275 - loss 0.03669472 - time (sec): 6.18 - samples/sec: 1829.51 - lr: 0.000020 - momentum: 0.000000
2023-10-17 08:31:53,042 epoch 7 - iter 162/275 - loss 0.03633068 - time (sec): 7.40 - samples/sec: 1817.52 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:31:54,267 epoch 7 - iter 189/275 - loss 0.03456830 - time (sec): 8.62 - samples/sec: 1808.73 - lr: 0.000019 - momentum: 0.000000
2023-10-17 08:31:55,550 epoch 7 - iter 216/275 - loss 0.03372537 - time (sec): 9.91 - samples/sec: 1802.93 - lr: 0.000018 - momentum: 0.000000
2023-10-17 08:31:56,767 epoch 7 - iter 243/275 - loss 0.03209081 - time (sec): 11.12 - samples/sec: 1801.78 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:31:57,977 epoch 7 - iter 270/275 - loss 0.03176010 - time (sec): 12.33 - samples/sec: 1810.58 - lr: 0.000017 - momentum: 0.000000
2023-10-17 08:31:58,216 ----------------------------------------------------------------------------------------------------
2023-10-17 08:31:58,216 EPOCH 7 done: loss 0.0315 - lr: 0.000017
2023-10-17 08:31:58,850 DEV : loss 0.19586947560310364 - f1-score (micro avg)  0.8571
2023-10-17 08:31:58,855 ----------------------------------------------------------------------------------------------------
2023-10-17 08:32:00,093 epoch 8 - iter 27/275 - loss 0.02666517 - time (sec): 1.24 - samples/sec: 1830.72 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:32:01,313 epoch 8 - iter 54/275 - loss 0.01392626 - time (sec): 2.46 - samples/sec: 1775.87 - lr: 0.000016 - momentum: 0.000000
2023-10-17 08:32:02,537 epoch 8 - iter 81/275 - loss 0.01804389 - time (sec): 3.68 - samples/sec: 1792.60 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:32:03,815 epoch 8 - iter 108/275 - loss 0.01817145 - time (sec): 4.96 - samples/sec: 1803.23 - lr: 0.000015 - momentum: 0.000000
2023-10-17 08:32:05,021 epoch 8 - iter 135/275 - loss 0.01518644 - time (sec): 6.16 - samples/sec: 1823.56 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:32:06,300 epoch 8 - iter 162/275 - loss 0.01839497 - time (sec): 7.44 - samples/sec: 1813.93 - lr: 0.000014 - momentum: 0.000000
2023-10-17 08:32:07,504 epoch 8 - iter 189/275 - loss 0.02003524 - time (sec): 8.65 - samples/sec: 1802.85 - lr: 0.000013 - momentum: 0.000000
2023-10-17 08:32:08,743 epoch 8 - iter 216/275 - loss 0.02011609 - time (sec): 9.89 - samples/sec: 1819.62 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:32:09,981 epoch 8 - iter 243/275 - loss 0.02210028 - time (sec): 11.12 - samples/sec: 1801.95 - lr: 0.000012 - momentum: 0.000000
2023-10-17 08:32:11,251 epoch 8 - iter 270/275 - loss 0.02672486 - time (sec): 12.39 - samples/sec: 1805.66 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:32:11,486 ----------------------------------------------------------------------------------------------------
2023-10-17 08:32:11,486 EPOCH 8 done: loss 0.0263 - lr: 0.000011
2023-10-17 08:32:12,131 DEV : loss 0.21000421047210693 - f1-score (micro avg)  0.8641
2023-10-17 08:32:12,135 ----------------------------------------------------------------------------------------------------
2023-10-17 08:32:13,364 epoch 9 - iter 27/275 - loss 0.00305086 - time (sec): 1.23 - samples/sec: 1730.74 - lr: 0.000011 - momentum: 0.000000
2023-10-17 08:32:14,580 epoch 9 - iter 54/275 - loss 0.00270286 - time (sec): 2.44 - samples/sec: 1755.39 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:32:15,807 epoch 9 - iter 81/275 - loss 0.00203368 - time (sec): 3.67 - samples/sec: 1761.21 - lr: 0.000010 - momentum: 0.000000
2023-10-17 08:32:17,082 epoch 9 - iter 108/275 - loss 0.01780784 - time (sec): 4.95 - samples/sec: 1810.17 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:32:18,335 epoch 9 - iter 135/275 - loss 0.01967582 - time (sec): 6.20 - samples/sec: 1770.73 - lr: 0.000009 - momentum: 0.000000
2023-10-17 08:32:19,548 epoch 9 - iter 162/275 - loss 0.01748341 - time (sec): 7.41 - samples/sec: 1794.99 - lr: 0.000008 - momentum: 0.000000
2023-10-17 08:32:20,778 epoch 9 - iter 189/275 - loss 0.01562891 - time (sec): 8.64 - samples/sec: 1799.38 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:32:22,019 epoch 9 - iter 216/275 - loss 0.01607793 - time (sec): 9.88 - samples/sec: 1796.68 - lr: 0.000007 - momentum: 0.000000
2023-10-17 08:32:23,257 epoch 9 - iter 243/275 - loss 0.01746249 - time (sec): 11.12 - samples/sec: 1804.09 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:32:24,489 epoch 9 - iter 270/275 - loss 0.01638516 - time (sec): 12.35 - samples/sec: 1807.07 - lr: 0.000006 - momentum: 0.000000
2023-10-17 08:32:24,723 ----------------------------------------------------------------------------------------------------
2023-10-17 08:32:24,723 EPOCH 9 done: loss 0.0161 - lr: 0.000006
2023-10-17 08:32:25,360 DEV : loss 0.2169550061225891 - f1-score (micro avg)  0.8599
2023-10-17 08:32:25,365 ----------------------------------------------------------------------------------------------------
2023-10-17 08:32:26,559 epoch 10 - iter 27/275 - loss 0.00307588 - time (sec): 1.19 - samples/sec: 1836.56 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:32:27,773 epoch 10 - iter 54/275 - loss 0.00313909 - time (sec): 2.41 - samples/sec: 1856.72 - lr: 0.000005 - momentum: 0.000000
2023-10-17 08:32:29,004 epoch 10 - iter 81/275 - loss 0.00346169 - time (sec): 3.64 - samples/sec: 1880.55 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:32:30,337 epoch 10 - iter 108/275 - loss 0.01610834 - time (sec): 4.97 - samples/sec: 1857.99 - lr: 0.000004 - momentum: 0.000000
2023-10-17 08:32:31,548 epoch 10 - iter 135/275 - loss 0.01454205 - time (sec): 6.18 - samples/sec: 1797.86 - lr: 0.000003 - momentum: 0.000000
2023-10-17 08:32:32,814 epoch 10 - iter 162/275 - loss 0.01210467 - time (sec): 7.45 - samples/sec: 1822.92 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:32:34,057 epoch 10 - iter 189/275 - loss 0.01192225 - time (sec): 8.69 - samples/sec: 1832.07 - lr: 0.000002 - momentum: 0.000000
2023-10-17 08:32:35,281 epoch 10 - iter 216/275 - loss 0.01198531 - time (sec): 9.91 - samples/sec: 1825.72 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:32:36,533 epoch 10 - iter 243/275 - loss 0.01311181 - time (sec): 11.17 - samples/sec: 1795.65 - lr: 0.000001 - momentum: 0.000000
2023-10-17 08:32:37,753 epoch 10 - iter 270/275 - loss 0.01318977 - time (sec): 12.39 - samples/sec: 1799.00 - lr: 0.000000 - momentum: 0.000000
2023-10-17 08:32:37,984 ----------------------------------------------------------------------------------------------------
2023-10-17 08:32:37,984 EPOCH 10 done: loss 0.0129 - lr: 0.000000
2023-10-17 08:32:38,623 DEV : loss 0.2227589339017868 - f1-score (micro avg)  0.8595
2023-10-17 08:32:38,969 ----------------------------------------------------------------------------------------------------
2023-10-17 08:32:38,970 Loading model from best epoch ...
2023-10-17 08:32:40,335 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-17 08:32:41,143 Results:
- F-score (micro) 0.8802
- F-score (macro) 0.6068
- Accuracy 0.8048

By class:
              precision    recall  f1-score   support

       scope     0.8851    0.8750    0.8800       176
        pers     0.9231    0.9375    0.9302       128
        work     0.7975    0.8514    0.8235        74
         loc     0.3333    0.5000    0.4000         2
      object     0.0000    0.0000    0.0000         2

   micro avg     0.8756    0.8848    0.8802       382
   macro avg     0.5878    0.6328    0.6068       382
weighted avg     0.8733    0.8848    0.8788       382

2023-10-17 08:32:41,143 ----------------------------------------------------------------------------------------------------
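The best epoch is chosen by the highest dev micro-F1 reported in the `DEV :` lines above (epoch 4, 0.8699, is the last "saving best model" here). A minimal sketch of how such scores can be pulled back out of a saved Flair log with the standard library; the regex and the inline sample are assumptions modeled on the line format above, not part of Flair's API:

```python
import re

# Matches Flair's per-epoch dev lines, e.g.
# "DEV : loss 0.18058346211910248 - f1-score (micro avg)  0.8699"
DEV_RE = re.compile(r"DEV : loss ([\d.]+).*?f1-score \(micro avg\)\s+([\d.]+)")

def best_dev_epoch(log_text: str):
    """Return (epoch, micro_f1, dev_loss) for the epoch with the best dev micro-F1.

    Epochs are numbered from 1 in the order their DEV lines appear.
    Returns None if no DEV lines are found.
    """
    scores = [(float(m.group(2)), float(m.group(1))) for m in DEV_RE.finditer(log_text)]
    if not scores:
        return None
    best = max(range(len(scores)), key=lambda i: scores[i][0])
    f1, loss = scores[best]
    return best + 1, f1, loss

# Hypothetical three-epoch excerpt in the same format as the log above:
sample = """\
DEV : loss 0.19101294875144958 - f1-score (micro avg)  0.7729
DEV : loss 0.18058346211910248 - f1-score (micro avg)  0.8699
DEV : loss 0.2227589339017868 - f1-score (micro avg)  0.8595
"""
epoch, f1, loss = best_dev_epoch(sample)
print(epoch, f1)  # epoch 2 of the excerpt has the best micro-F1, 0.8699
```

Applied to the full log above, the same scan would land on epoch 4 (0.8699), matching the checkpoint that `best-model.pt` was evaluated from.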