2023-10-13 08:11:11,974 ----------------------------------------------------------------------------------------------------
2023-10-13 08:11:11,975 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 08:11:11,975 ----------------------------------------------------------------------------------------------------
2023-10-13 08:11:11,976 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-13 08:11:11,976 ----------------------------------------------------------------------------------------------------
2023-10-13 08:11:11,976 Train: 1100 sentences
2023-10-13 08:11:11,976 (train_with_dev=False, train_with_test=False)
2023-10-13 08:11:11,976 ----------------------------------------------------------------------------------------------------
2023-10-13 08:11:11,976 Training Params:
2023-10-13 08:11:11,976 - learning_rate: "3e-05"
2023-10-13 08:11:11,976 - mini_batch_size: "4"
2023-10-13 08:11:11,976 - max_epochs: "10"
2023-10-13 08:11:11,976 - shuffle: "True"
2023-10-13 08:11:11,976 ----------------------------------------------------------------------------------------------------
2023-10-13 08:11:11,976 Plugins:
2023-10-13 08:11:11,976 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 08:11:11,976 ----------------------------------------------------------------------------------------------------
2023-10-13 08:11:11,976 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 08:11:11,976 - metric: "('micro avg', 'f1-score')"
2023-10-13 08:11:11,976 ----------------------------------------------------------------------------------------------------
2023-10-13 08:11:11,976 Computation:
2023-10-13 08:11:11,976 - compute on device: cuda:0
2023-10-13 08:11:11,976 - embedding storage: none
2023-10-13 08:11:11,976 ----------------------------------------------------------------------------------------------------
2023-10-13 08:11:11,976 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-13 08:11:11,976 ----------------------------------------------------------------------------------------------------
2023-10-13 08:11:11,976 ----------------------------------------------------------------------------------------------------
2023-10-13 08:11:14,096 epoch 1 - iter 27/275 - loss 3.41344393 - time (sec): 2.12 - samples/sec: 948.02 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:11:15,305 epoch 1 - iter 54/275 - loss 3.04522400 - time (sec): 3.33 - samples/sec: 1292.44 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:11:16,450 epoch 1 - iter 81/275 - loss 2.43708544 - time (sec): 4.47 - samples/sec: 1474.71 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:11:17,588 epoch 1 - iter 108/275 - loss 2.03067121 - time (sec): 5.61 - samples/sec: 1548.78 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:11:18,736 epoch 1 - iter 135/275 - loss 1.74150180 - time (sec): 6.76 - samples/sec: 1625.01 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:11:19,897 epoch 1 - iter 162/275 - loss 1.52957870 - time (sec): 7.92 - samples/sec: 1667.91 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:11:21,094 epoch 1 - iter 189/275 - loss 1.37151362 - time (sec): 9.12 - samples/sec: 1704.72 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:11:22,275 epoch 1 - iter 216/275 - loss 1.25757942 - time (sec): 10.30 - samples/sec: 1714.05 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:11:23,488 epoch 1 - iter 243/275 - loss 1.16205347 - time (sec): 11.51 - samples/sec: 1734.45 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:11:24,675 epoch 1 - iter 270/275 - loss 1.07527020 - time (sec): 12.70 - samples/sec: 1761.36 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:11:24,901 ----------------------------------------------------------------------------------------------------
2023-10-13 08:11:24,901 EPOCH 1 done: loss 1.0626 - lr: 0.000029
2023-10-13 08:11:25,495 DEV : loss 0.26535165309906006 - f1-score (micro avg) 0.6651
2023-10-13 08:11:25,505 saving best model
2023-10-13 08:11:25,977 ----------------------------------------------------------------------------------------------------
2023-10-13 08:11:27,434 epoch 2 - iter 27/275 - loss 0.28291135 - time (sec): 1.46 - samples/sec: 1409.65 - lr: 0.000030 - momentum: 0.000000
2023-10-13 08:11:28,640 epoch 2 - iter 54/275 - loss 0.28411411 - time (sec): 2.66 - samples/sec: 1600.51 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:11:29,903 epoch 2 - iter 81/275 - loss 0.26595659 - time (sec): 3.92 - samples/sec: 1592.39 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:11:31,103 epoch 2 - iter 108/275 - loss 0.24583362 - time (sec): 5.12 - samples/sec: 1671.33 - lr: 0.000029 - momentum: 0.000000
2023-10-13 08:11:32,271 epoch 2 - iter 135/275 - loss 0.23033553 - time (sec): 6.29 - samples/sec: 1746.89 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:11:33,517 epoch 2 - iter 162/275 - loss 0.21824353 - time (sec): 7.54 - samples/sec: 1737.36 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:11:34,745 epoch 2 - iter 189/275 - loss 0.20901579 - time (sec): 8.77 - samples/sec: 1757.63 - lr: 0.000028 - momentum: 0.000000
2023-10-13 08:11:35,978 epoch 2 - iter 216/275 - loss 0.20381171 - time (sec): 10.00 - samples/sec: 1770.95 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:11:37,260 epoch 2 - iter 243/275 - loss 0.19606007 - time (sec): 11.28 - samples/sec: 1767.74 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:11:38,477 epoch 2 - iter 270/275 - loss 0.19753698 - time (sec): 12.50 - samples/sec: 1788.59 - lr: 0.000027 - momentum: 0.000000
2023-10-13 08:11:38,715 ----------------------------------------------------------------------------------------------------
2023-10-13 08:11:38,715 EPOCH 2 done: loss 0.1966 - lr: 0.000027
2023-10-13 08:11:39,429 DEV : loss 0.14341259002685547 - f1-score (micro avg) 0.8225
2023-10-13 08:11:39,433 saving best model
2023-10-13 08:11:40,074 ----------------------------------------------------------------------------------------------------
2023-10-13 08:11:41,246 epoch 3 - iter 27/275 - loss 0.14052391 - time (sec): 1.17 - samples/sec: 1804.82 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:11:42,416 epoch 3 - iter 54/275 - loss 0.11470335 - time (sec): 2.34 - samples/sec: 1907.08 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:11:43,589 epoch 3 - iter 81/275 - loss 0.11498850 - time (sec): 3.51 - samples/sec: 1916.97 - lr: 0.000026 - momentum: 0.000000
2023-10-13 08:11:44,783 epoch 3 - iter 108/275 - loss 0.12164868 - time (sec): 4.71 - samples/sec: 1940.40 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:11:45,992 epoch 3 - iter 135/275 - loss 0.11937486 - time (sec): 5.92 - samples/sec: 1907.64 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:11:47,232 epoch 3 - iter 162/275 - loss 0.11352774 - time (sec): 7.16 - samples/sec: 1877.03 - lr: 0.000025 - momentum: 0.000000
2023-10-13 08:11:48,465 epoch 3 - iter 189/275 - loss 0.10844061 - time (sec): 8.39 - samples/sec: 1882.81 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:11:49,656 epoch 3 - iter 216/275 - loss 0.11716378 - time (sec): 9.58 - samples/sec: 1890.06 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:11:50,825 epoch 3 - iter 243/275 - loss 0.11039802 - time (sec): 10.75 - samples/sec: 1867.89 - lr: 0.000024 - momentum: 0.000000
2023-10-13 08:11:52,038 epoch 3 - iter 270/275 - loss 0.11167883 - time (sec): 11.96 - samples/sec: 1867.36 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:11:52,258 ----------------------------------------------------------------------------------------------------
2023-10-13 08:11:52,258 EPOCH 3 done: loss 0.1124 - lr: 0.000023
2023-10-13 08:11:52,964 DEV : loss 0.1699625849723816 - f1-score (micro avg) 0.8219
2023-10-13 08:11:52,968 ----------------------------------------------------------------------------------------------------
2023-10-13 08:11:54,180 epoch 4 - iter 27/275 - loss 0.07822413 - time (sec): 1.21 - samples/sec: 1936.44 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:11:55,369 epoch 4 - iter 54/275 - loss 0.06541721 - time (sec): 2.40 - samples/sec: 1994.01 - lr: 0.000023 - momentum: 0.000000
2023-10-13 08:11:56,534 epoch 4 - iter 81/275 - loss 0.08294608 - time (sec): 3.56 - samples/sec: 1970.79 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:11:57,734 epoch 4 - iter 108/275 - loss 0.08110684 - time (sec): 4.77 - samples/sec: 1884.74 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:11:58,920 epoch 4 - iter 135/275 - loss 0.07760535 - time (sec): 5.95 - samples/sec: 1893.24 - lr: 0.000022 - momentum: 0.000000
2023-10-13 08:12:00,144 epoch 4 - iter 162/275 - loss 0.08207656 - time (sec): 7.17 - samples/sec: 1869.14 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:12:01,281 epoch 4 - iter 189/275 - loss 0.08297912 - time (sec): 8.31 - samples/sec: 1888.93 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:12:02,415 epoch 4 - iter 216/275 - loss 0.07657454 - time (sec): 9.45 - samples/sec: 1877.55 - lr: 0.000021 - momentum: 0.000000
2023-10-13 08:12:03,558 epoch 4 - iter 243/275 - loss 0.07536046 - time (sec): 10.59 - samples/sec: 1886.54 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:12:04,696 epoch 4 - iter 270/275 - loss 0.08231829 - time (sec): 11.73 - samples/sec: 1907.55 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:12:04,908 ----------------------------------------------------------------------------------------------------
2023-10-13 08:12:04,908 EPOCH 4 done: loss 0.0836 - lr: 0.000020
2023-10-13 08:12:05,597 DEV : loss 0.18230104446411133 - f1-score (micro avg) 0.837
2023-10-13 08:12:05,601 saving best model
2023-10-13 08:12:06,054 ----------------------------------------------------------------------------------------------------
2023-10-13 08:12:07,291 epoch 5 - iter 27/275 - loss 0.05892632 - time (sec): 1.24 - samples/sec: 1574.79 - lr: 0.000020 - momentum: 0.000000
2023-10-13 08:12:08,517 epoch 5 - iter 54/275 - loss 0.04236338 - time (sec): 2.46 - samples/sec: 1650.75 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:12:09,771 epoch 5 - iter 81/275 - loss 0.04541577 - time (sec): 3.72 - samples/sec: 1782.19 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:12:10,956 epoch 5 - iter 108/275 - loss 0.04872586 - time (sec): 4.90 - samples/sec: 1850.56 - lr: 0.000019 - momentum: 0.000000
2023-10-13 08:12:12,114 epoch 5 - iter 135/275 - loss 0.04698697 - time (sec): 6.06 - samples/sec: 1887.81 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:12:13,300 epoch 5 - iter 162/275 - loss 0.05183769 - time (sec): 7.24 - samples/sec: 1896.20 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:12:14,480 epoch 5 - iter 189/275 - loss 0.06089727 - time (sec): 8.42 - samples/sec: 1877.66 - lr: 0.000018 - momentum: 0.000000
2023-10-13 08:12:15,659 epoch 5 - iter 216/275 - loss 0.06591564 - time (sec): 9.60 - samples/sec: 1879.51 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:12:16,866 epoch 5 - iter 243/275 - loss 0.06742207 - time (sec): 10.81 - samples/sec: 1865.01 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:12:18,117 epoch 5 - iter 270/275 - loss 0.06374218 - time (sec): 12.06 - samples/sec: 1858.11 - lr: 0.000017 - momentum: 0.000000
2023-10-13 08:12:18,357 ----------------------------------------------------------------------------------------------------
2023-10-13 08:12:18,358 EPOCH 5 done: loss 0.0641 - lr: 0.000017
2023-10-13 08:12:19,077 DEV : loss 0.16086943447589874 - f1-score (micro avg) 0.8507
2023-10-13 08:12:19,082 saving best model
2023-10-13 08:12:19,703 ----------------------------------------------------------------------------------------------------
2023-10-13 08:12:20,959 epoch 6 - iter 27/275 - loss 0.06482260 - time (sec): 1.25 - samples/sec: 1651.89 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:12:22,261 epoch 6 - iter 54/275 - loss 0.05022329 - time (sec): 2.56 - samples/sec: 1712.96 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:12:23,586 epoch 6 - iter 81/275 - loss 0.04459889 - time (sec): 3.88 - samples/sec: 1681.97 - lr: 0.000016 - momentum: 0.000000
2023-10-13 08:12:24,931 epoch 6 - iter 108/275 - loss 0.05034734 - time (sec): 5.23 - samples/sec: 1691.27 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:12:26,185 epoch 6 - iter 135/275 - loss 0.04794843 - time (sec): 6.48 - samples/sec: 1702.29 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:12:27,457 epoch 6 - iter 162/275 - loss 0.04469048 - time (sec): 7.75 - samples/sec: 1711.83 - lr: 0.000015 - momentum: 0.000000
2023-10-13 08:12:28,720 epoch 6 - iter 189/275 - loss 0.03975858 - time (sec): 9.01 - samples/sec: 1731.47 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:12:29,929 epoch 6 - iter 216/275 - loss 0.03696781 - time (sec): 10.22 - samples/sec: 1733.47 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:12:31,186 epoch 6 - iter 243/275 - loss 0.03935542 - time (sec): 11.48 - samples/sec: 1740.67 - lr: 0.000014 - momentum: 0.000000
2023-10-13 08:12:32,458 epoch 6 - iter 270/275 - loss 0.04392720 - time (sec): 12.75 - samples/sec: 1756.94 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:12:32,682 ----------------------------------------------------------------------------------------------------
2023-10-13 08:12:32,682 EPOCH 6 done: loss 0.0434 - lr: 0.000013
2023-10-13 08:12:33,386 DEV : loss 0.15735526382923126 - f1-score (micro avg) 0.8684
2023-10-13 08:12:33,391 saving best model
2023-10-13 08:12:33,883 ----------------------------------------------------------------------------------------------------
2023-10-13 08:12:35,068 epoch 7 - iter 27/275 - loss 0.01718411 - time (sec): 1.18 - samples/sec: 2037.96 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:12:36,219 epoch 7 - iter 54/275 - loss 0.02632338 - time (sec): 2.33 - samples/sec: 1891.42 - lr: 0.000013 - momentum: 0.000000
2023-10-13 08:12:37,464 epoch 7 - iter 81/275 - loss 0.01990612 - time (sec): 3.58 - samples/sec: 1772.46 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:12:38,653 epoch 7 - iter 108/275 - loss 0.03743037 - time (sec): 4.77 - samples/sec: 1812.69 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:12:39,876 epoch 7 - iter 135/275 - loss 0.03668093 - time (sec): 5.99 - samples/sec: 1784.90 - lr: 0.000012 - momentum: 0.000000
2023-10-13 08:12:41,155 epoch 7 - iter 162/275 - loss 0.03138511 - time (sec): 7.27 - samples/sec: 1791.69 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:12:42,421 epoch 7 - iter 189/275 - loss 0.02939117 - time (sec): 8.54 - samples/sec: 1799.94 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:12:43,718 epoch 7 - iter 216/275 - loss 0.03085943 - time (sec): 9.83 - samples/sec: 1803.62 - lr: 0.000011 - momentum: 0.000000
2023-10-13 08:12:44,987 epoch 7 - iter 243/275 - loss 0.03208756 - time (sec): 11.10 - samples/sec: 1778.53 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:12:46,313 epoch 7 - iter 270/275 - loss 0.03284565 - time (sec): 12.43 - samples/sec: 1793.59 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:12:46,537 ----------------------------------------------------------------------------------------------------
2023-10-13 08:12:46,537 EPOCH 7 done: loss 0.0325 - lr: 0.000010
2023-10-13 08:12:47,234 DEV : loss 0.16358648240566254 - f1-score (micro avg) 0.8633
2023-10-13 08:12:47,239 ----------------------------------------------------------------------------------------------------
2023-10-13 08:12:48,482 epoch 8 - iter 27/275 - loss 0.02547032 - time (sec): 1.24 - samples/sec: 1880.06 - lr: 0.000010 - momentum: 0.000000
2023-10-13 08:12:49,720 epoch 8 - iter 54/275 - loss 0.02214925 - time (sec): 2.48 - samples/sec: 1855.71 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:12:50,960 epoch 8 - iter 81/275 - loss 0.03531222 - time (sec): 3.72 - samples/sec: 1818.99 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:12:52,130 epoch 8 - iter 108/275 - loss 0.02923483 - time (sec): 4.89 - samples/sec: 1832.89 - lr: 0.000009 - momentum: 0.000000
2023-10-13 08:12:53,363 epoch 8 - iter 135/275 - loss 0.03178601 - time (sec): 6.12 - samples/sec: 1831.24 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:12:54,617 epoch 8 - iter 162/275 - loss 0.03043106 - time (sec): 7.38 - samples/sec: 1811.28 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:12:55,797 epoch 8 - iter 189/275 - loss 0.02728853 - time (sec): 8.56 - samples/sec: 1810.65 - lr: 0.000008 - momentum: 0.000000
2023-10-13 08:12:57,030 epoch 8 - iter 216/275 - loss 0.02688451 - time (sec): 9.79 - samples/sec: 1807.76 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:12:58,291 epoch 8 - iter 243/275 - loss 0.02524643 - time (sec): 11.05 - samples/sec: 1809.57 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:12:59,521 epoch 8 - iter 270/275 - loss 0.02538893 - time (sec): 12.28 - samples/sec: 1824.14 - lr: 0.000007 - momentum: 0.000000
2023-10-13 08:12:59,737 ----------------------------------------------------------------------------------------------------
2023-10-13 08:12:59,738 EPOCH 8 done: loss 0.0250 - lr: 0.000007
2023-10-13 08:13:00,504 DEV : loss 0.16763055324554443 - f1-score (micro avg) 0.8729
2023-10-13 08:13:00,509 saving best model
2023-10-13 08:13:00,952 ----------------------------------------------------------------------------------------------------
2023-10-13 08:13:02,247 epoch 9 - iter 27/275 - loss 0.01042150 - time (sec): 1.29 - samples/sec: 1822.65 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:13:03,531 epoch 9 - iter 54/275 - loss 0.00705875 - time (sec): 2.57 - samples/sec: 1730.90 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:13:04,847 epoch 9 - iter 81/275 - loss 0.01890319 - time (sec): 3.89 - samples/sec: 1820.81 - lr: 0.000006 - momentum: 0.000000
2023-10-13 08:13:06,113 epoch 9 - iter 108/275 - loss 0.02308810 - time (sec): 5.15 - samples/sec: 1807.70 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:13:07,375 epoch 9 - iter 135/275 - loss 0.02162080 - time (sec): 6.41 - samples/sec: 1789.05 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:13:08,602 epoch 9 - iter 162/275 - loss 0.02289687 - time (sec): 7.64 - samples/sec: 1822.00 - lr: 0.000005 - momentum: 0.000000
2023-10-13 08:13:09,762 epoch 9 - iter 189/275 - loss 0.02061129 - time (sec): 8.80 - samples/sec: 1811.46 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:13:10,960 epoch 9 - iter 216/275 - loss 0.01818496 - time (sec): 10.00 - samples/sec: 1816.10 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:13:12,137 epoch 9 - iter 243/275 - loss 0.01863008 - time (sec): 11.18 - samples/sec: 1806.89 - lr: 0.000004 - momentum: 0.000000
2023-10-13 08:13:13,308 epoch 9 - iter 270/275 - loss 0.02041663 - time (sec): 12.35 - samples/sec: 1818.35 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:13:13,526 ----------------------------------------------------------------------------------------------------
2023-10-13 08:13:13,527 EPOCH 9 done: loss 0.0206 - lr: 0.000003
2023-10-13 08:13:14,226 DEV : loss 0.1689862459897995 - f1-score (micro avg) 0.8694
2023-10-13 08:13:14,230 ----------------------------------------------------------------------------------------------------
2023-10-13 08:13:15,404 epoch 10 - iter 27/275 - loss 0.00787064 - time (sec): 1.17 - samples/sec: 1973.41 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:13:16,591 epoch 10 - iter 54/275 - loss 0.01301629 - time (sec): 2.36 - samples/sec: 2044.00 - lr: 0.000003 - momentum: 0.000000
2023-10-13 08:13:17,826 epoch 10 - iter 81/275 - loss 0.00982255 - time (sec): 3.59 - samples/sec: 1971.45 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:13:19,103 epoch 10 - iter 108/275 - loss 0.01059154 - time (sec): 4.87 - samples/sec: 1900.79 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:13:20,337 epoch 10 - iter 135/275 - loss 0.01257537 - time (sec): 6.11 - samples/sec: 1903.15 - lr: 0.000002 - momentum: 0.000000
2023-10-13 08:13:21,583 epoch 10 - iter 162/275 - loss 0.01410965 - time (sec): 7.35 - samples/sec: 1858.97 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:13:22,791 epoch 10 - iter 189/275 - loss 0.01325312 - time (sec): 8.56 - samples/sec: 1849.43 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:13:24,014 epoch 10 - iter 216/275 - loss 0.01642775 - time (sec): 9.78 - samples/sec: 1838.64 - lr: 0.000001 - momentum: 0.000000
2023-10-13 08:13:25,279 epoch 10 - iter 243/275 - loss 0.01749416 - time (sec): 11.05 - samples/sec: 1833.44 - lr: 0.000000 - momentum: 0.000000
2023-10-13 08:13:26,539 epoch 10 - iter 270/275 - loss 0.01729275 - time (sec): 12.31 - samples/sec: 1819.59 - lr: 0.000000 - momentum: 0.000000
2023-10-13 08:13:26,757 ----------------------------------------------------------------------------------------------------
2023-10-13 08:13:26,757 EPOCH 10 done: loss 0.0171 - lr: 0.000000
2023-10-13 08:13:27,449 DEV : loss 0.17212548851966858 - f1-score (micro avg) 0.8738
2023-10-13 08:13:27,453 saving best model
2023-10-13 08:13:28,281 ----------------------------------------------------------------------------------------------------
2023-10-13 08:13:28,283 Loading model from best epoch ...
2023-10-13 08:13:30,102 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-13 08:13:30,894
Results:
- F-score (micro) 0.9043
- F-score (macro) 0.5419
- Accuracy 0.8394

By class:
              precision    recall  f1-score   support

       scope     0.8883    0.9034    0.8958       176
        pers     0.9683    0.9531    0.9606       128
        work     0.8421    0.8649    0.8533        74
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.9055    0.9031    0.9043       382
   macro avg     0.5397    0.5443    0.5419       382
weighted avg     0.8968    0.9031    0.8999       382

2023-10-13 08:13:30,894 ----------------------------------------------------------------------------------------------------