2023-10-13 22:44:08,972 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,973 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 22:44:08,973 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,973 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 22:44:08,973 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,973 Train: 7936 sentences
2023-10-13 22:44:08,973 (train_with_dev=False, train_with_test=False)
2023-10-13 22:44:08,973 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,973 Training Params:
2023-10-13 22:44:08,973  - learning_rate: "3e-05"
2023-10-13 22:44:08,973  - mini_batch_size: "4"
2023-10-13 22:44:08,973  - max_epochs: "10"
2023-10-13 22:44:08,973  - shuffle: "True"
2023-10-13 22:44:08,973 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,973 Plugins:
2023-10-13 22:44:08,973  - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,974 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 22:44:08,974  - metric: "('micro avg', 'f1-score')"
2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,974 Computation:
2023-10-13 22:44:08,974  - compute on device: cuda:0
2023-10-13 22:44:08,974  - embedding storage: none
2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,974 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:08,974 ----------------------------------------------------------------------------------------------------
2023-10-13 22:44:18,081 epoch 1 - iter 198/1984 - loss 1.81064836 - time (sec): 9.11 - samples/sec: 1829.54 - lr: 0.000003 - momentum: 0.000000
2023-10-13 22:44:27,066 epoch 1 - iter 396/1984 - loss 1.08951470 - time (sec): 18.09 - samples/sec: 1811.50 - lr: 0.000006 - momentum: 0.000000
2023-10-13 22:44:36,144 epoch 1 - iter 594/1984 - loss 0.79725090 - time (sec): 27.17 - samples/sec: 1822.29 - lr: 0.000009 - momentum: 0.000000
2023-10-13 22:44:45,070 epoch 1 - iter 792/1984 - loss 0.65426686 - time (sec): 36.10 - samples/sec: 1820.77 - lr: 0.000012 - momentum: 0.000000
2023-10-13 22:44:54,112 epoch 1 - iter 990/1984 - loss 0.55873840 - time (sec): 45.14 - samples/sec: 1826.09 - lr: 0.000015 - momentum: 0.000000
2023-10-13 22:45:02,966 epoch 1 - iter 1188/1984 - loss 0.49032838 - time (sec): 53.99 - samples/sec: 1833.75 - lr: 0.000018 - momentum: 0.000000
2023-10-13 22:45:11,931 epoch 1 - iter 1386/1984 - loss 0.44094947 - time (sec): 62.96 - samples/sec: 1830.34 - lr: 0.000021 - momentum: 0.000000
2023-10-13 22:45:20,963 epoch 1 - iter 1584/1984 - loss 0.40474175 - time (sec): 71.99 - samples/sec: 1830.66 - lr: 0.000024 - momentum: 0.000000
2023-10-13 22:45:30,256 epoch 1 - iter 1782/1984 - loss 0.37525747 - time (sec): 81.28 - samples/sec: 1818.44 - lr: 0.000027 - momentum: 0.000000
2023-10-13 22:45:39,648 epoch 1 - iter 1980/1984 - loss 0.35281987 - time (sec): 90.67 - samples/sec: 1805.98 - lr: 0.000030 - momentum: 0.000000
2023-10-13 22:45:39,839 ----------------------------------------------------------------------------------------------------
2023-10-13 22:45:39,839 EPOCH 1 done: loss 0.3526 - lr: 0.000030
2023-10-13 22:45:42,914 DEV : loss 0.10146976262331009 - f1-score (micro avg) 0.7264
2023-10-13 22:45:42,937 saving best model
2023-10-13 22:45:43,336 ----------------------------------------------------------------------------------------------------
2023-10-13 22:45:52,299 epoch 2 - iter 198/1984 - loss 0.13085115 - time (sec): 8.96 - samples/sec: 1965.02 - lr: 0.000030 - momentum: 0.000000
2023-10-13 22:46:01,550 epoch 2 - iter 396/1984 - loss 0.12180186 - time (sec): 18.21 - samples/sec: 1845.38 - lr: 0.000029 - momentum: 0.000000
2023-10-13 22:46:10,520 epoch 2 - iter 594/1984 - loss 0.11729603 - time (sec): 27.18 - samples/sec: 1843.65 - lr: 0.000029 - momentum: 0.000000
2023-10-13 22:46:19,430 epoch 2 - iter 792/1984 - loss 0.11624537 - time (sec): 36.09 - samples/sec: 1816.81 - lr: 0.000029 - momentum: 0.000000
2023-10-13 22:46:28,458 epoch 2 - iter 990/1984 - loss 0.11635377 - time (sec): 45.12 - samples/sec: 1810.56 - lr: 0.000028 - momentum: 0.000000
2023-10-13 22:46:37,650 epoch 2 - iter 1188/1984 - loss 0.11405465 - time (sec): 54.31 - samples/sec: 1800.34 - lr: 0.000028 - momentum: 0.000000
2023-10-13 22:46:46,738 epoch 2 - iter 1386/1984 - loss 0.11342189 - time (sec): 63.40 - samples/sec: 1795.44 - lr: 0.000028 - momentum: 0.000000
2023-10-13 22:46:55,776 epoch 2 - iter 1584/1984 - loss 0.11424700 - time (sec): 72.44 - samples/sec: 1803.73 - lr: 0.000027 - momentum: 0.000000
2023-10-13 22:47:04,814 epoch 2 - iter 1782/1984 - loss 0.11408106 - time (sec): 81.48 - samples/sec: 1794.31 - lr: 0.000027 - momentum: 0.000000
2023-10-13 22:47:13,932 epoch 2 - iter 1980/1984 - loss 0.11249722 - time (sec): 90.59 - samples/sec: 1807.49 - lr: 0.000027 - momentum: 0.000000
2023-10-13 22:47:14,114 ----------------------------------------------------------------------------------------------------
2023-10-13 22:47:14,114 EPOCH 2 done: loss 0.1124 - lr: 0.000027
2023-10-13 22:47:17,587 DEV : loss 0.09548649936914444 - f1-score (micro avg) 0.744
2023-10-13 22:47:17,608 saving best model
2023-10-13 22:47:18,148 ----------------------------------------------------------------------------------------------------
2023-10-13 22:47:27,115 epoch 3 - iter 198/1984 - loss 0.06800668 - time (sec): 8.96 - samples/sec: 1822.91 - lr: 0.000026 - momentum: 0.000000
2023-10-13 22:47:36,104 epoch 3 - iter 396/1984 - loss 0.07109942 - time (sec): 17.95 - samples/sec: 1798.42 - lr: 0.000026 - momentum: 0.000000
2023-10-13 22:47:45,084 epoch 3 - iter 594/1984 - loss 0.07668389 - time (sec): 26.93 - samples/sec: 1795.77 - lr: 0.000026 - momentum: 0.000000
2023-10-13 22:47:54,089 epoch 3 - iter 792/1984 - loss 0.07789366 - time (sec): 35.94 - samples/sec: 1817.49 - lr: 0.000025 - momentum: 0.000000
2023-10-13 22:48:03,169 epoch 3 - iter 990/1984 - loss 0.07895567 - time (sec): 45.02 - samples/sec: 1825.49 - lr: 0.000025 - momentum: 0.000000
2023-10-13 22:48:12,181 epoch 3 - iter 1188/1984 - loss 0.08032932 - time (sec): 54.03 - samples/sec: 1818.24 - lr: 0.000025 - momentum: 0.000000
2023-10-13 22:48:21,733 epoch 3 - iter 1386/1984 - loss 0.08285120 - time (sec): 63.58 - samples/sec: 1802.90 - lr: 0.000024 - momentum: 0.000000
2023-10-13 22:48:30,700 epoch 3 - iter 1584/1984 - loss 0.08388123 - time (sec): 72.55 - samples/sec: 1799.06 - lr: 0.000024 - momentum: 0.000000
2023-10-13 22:48:39,738 epoch 3 - iter 1782/1984 - loss 0.08563997 - time (sec): 81.59 - samples/sec: 1800.26 - lr: 0.000024 - momentum: 0.000000
2023-10-13 22:48:48,772 epoch 3 - iter 1980/1984 - loss 0.08470780 - time (sec): 90.62 - samples/sec: 1805.95 - lr: 0.000023 - momentum: 0.000000
2023-10-13 22:48:48,950 ----------------------------------------------------------------------------------------------------
2023-10-13 22:48:48,950 EPOCH 3 done: loss 0.0847 - lr: 0.000023
2023-10-13 22:48:52,399 DEV : loss 0.12796364724636078 - f1-score (micro avg) 0.7407
2023-10-13 22:48:52,419 ----------------------------------------------------------------------------------------------------
2023-10-13 22:49:01,819 epoch 4 - iter 198/1984 - loss 0.05800278 - time (sec): 9.40 - samples/sec: 1812.92 - lr: 0.000023 - momentum: 0.000000
2023-10-13 22:49:10,901 epoch 4 - iter 396/1984 - loss 0.05641167 - time (sec): 18.48 - samples/sec: 1793.44 - lr: 0.000023 - momentum: 0.000000
2023-10-13 22:49:19,796 epoch 4 - iter 594/1984 - loss 0.05743702 - time (sec): 27.38 - samples/sec: 1787.23 - lr: 0.000022 - momentum: 0.000000
2023-10-13 22:49:28,726 epoch 4 - iter 792/1984 - loss 0.05915402 - time (sec): 36.31 - samples/sec: 1798.85 - lr: 0.000022 - momentum: 0.000000
2023-10-13 22:49:37,841 epoch 4 - iter 990/1984 - loss 0.05947704 - time (sec): 45.42 - samples/sec: 1795.29 - lr: 0.000022 - momentum: 0.000000
2023-10-13 22:49:46,872 epoch 4 - iter 1188/1984 - loss 0.06239465 - time (sec): 54.45 - samples/sec: 1801.43 - lr: 0.000021 - momentum: 0.000000
2023-10-13 22:49:55,856 epoch 4 - iter 1386/1984 - loss 0.06300882 - time (sec): 63.44 - samples/sec: 1807.04 - lr: 0.000021 - momentum: 0.000000
2023-10-13 22:50:04,872 epoch 4 - iter 1584/1984 - loss 0.06319248 - time (sec): 72.45 - samples/sec: 1804.06 - lr: 0.000021 - momentum: 0.000000
2023-10-13 22:50:13,909 epoch 4 - iter 1782/1984 - loss 0.06395252 - time (sec): 81.49 - samples/sec: 1810.75 - lr: 0.000020 - momentum: 0.000000
2023-10-13 22:50:22,853 epoch 4 - iter 1980/1984 - loss 0.06299855 - time (sec): 90.43 - samples/sec: 1811.89 - lr: 0.000020 - momentum: 0.000000
2023-10-13 22:50:23,030 ----------------------------------------------------------------------------------------------------
2023-10-13 22:50:23,030 EPOCH 4 done: loss 0.0631 - lr: 0.000020
2023-10-13 22:50:26,447 DEV : loss 0.18204569816589355 - f1-score (micro avg) 0.738
2023-10-13 22:50:26,468 ----------------------------------------------------------------------------------------------------
2023-10-13 22:50:35,948 epoch 5 - iter 198/1984 - loss 0.04150094 - time (sec): 9.48 - samples/sec: 1789.33 - lr: 0.000020 - momentum: 0.000000
2023-10-13 22:50:45,154 epoch 5 - iter 396/1984 - loss 0.04233067 - time (sec): 18.69 - samples/sec: 1769.03 - lr: 0.000019 - momentum: 0.000000
2023-10-13 22:50:54,257 epoch 5 - iter 594/1984 - loss 0.04297481 - time (sec): 27.79 - samples/sec: 1815.26 - lr: 0.000019 - momentum: 0.000000
2023-10-13 22:51:03,331 epoch 5 - iter 792/1984 - loss 0.04244711 - time (sec): 36.86 - samples/sec: 1802.29 - lr: 0.000019 - momentum: 0.000000
2023-10-13 22:51:12,472 epoch 5 - iter 990/1984 - loss 0.04326527 - time (sec): 46.00 - samples/sec: 1802.41 - lr: 0.000018 - momentum: 0.000000
2023-10-13 22:51:21,600 epoch 5 - iter 1188/1984 - loss 0.04560535 - time (sec): 55.13 - samples/sec: 1794.02 - lr: 0.000018 - momentum: 0.000000
2023-10-13 22:51:30,456 epoch 5 - iter 1386/1984 - loss 0.04527611 - time (sec): 63.99 - samples/sec: 1788.59 - lr: 0.000018 - momentum: 0.000000
2023-10-13 22:51:39,401 epoch 5 - iter 1584/1984 - loss 0.04572642 - time (sec): 72.93 - samples/sec: 1790.69 - lr: 0.000017 - momentum: 0.000000
2023-10-13 22:51:48,362 epoch 5 - iter 1782/1984 - loss 0.04653709 - time (sec): 81.89 - samples/sec: 1806.98 - lr: 0.000017 - momentum: 0.000000
2023-10-13 22:51:57,204 epoch 5 - iter 1980/1984 - loss 0.04705024 - time (sec): 90.73 - samples/sec: 1801.81 - lr: 0.000017 - momentum: 0.000000
2023-10-13 22:51:57,437 ----------------------------------------------------------------------------------------------------
2023-10-13 22:51:57,437 EPOCH 5 done: loss 0.0470 - lr: 0.000017
2023-10-13 22:52:01,269 DEV : loss 0.18338747322559357 - f1-score (micro avg) 0.7485
2023-10-13 22:52:01,290 saving best model
2023-10-13 22:52:01,805 ----------------------------------------------------------------------------------------------------
2023-10-13 22:52:10,662 epoch 6 - iter 198/1984 - loss 0.03423276 - time (sec): 8.85 - samples/sec: 1775.29 - lr: 0.000016 - momentum: 0.000000
2023-10-13 22:52:19,672 epoch 6 - iter 396/1984 - loss 0.03504659 - time (sec): 17.86 - samples/sec: 1783.45 - lr: 0.000016 - momentum: 0.000000
2023-10-13 22:52:28,638 epoch 6 - iter 594/1984 - loss 0.03522723 - time (sec): 26.83 - samples/sec: 1801.61 - lr: 0.000016 - momentum: 0.000000
2023-10-13 22:52:37,746 epoch 6 - iter 792/1984 - loss 0.03512497 - time (sec): 35.93 - samples/sec: 1812.74 - lr: 0.000015 - momentum: 0.000000
2023-10-13 22:52:46,705 epoch 6 - iter 990/1984 - loss 0.03574142 - time (sec): 44.89 - samples/sec: 1815.89 - lr: 0.000015 - momentum: 0.000000
2023-10-13 22:52:55,725 epoch 6 - iter 1188/1984 - loss 0.03636170 - time (sec): 53.91 - samples/sec: 1817.02 - lr: 0.000015 - momentum: 0.000000
2023-10-13 22:53:04,837 epoch 6 - iter 1386/1984 - loss 0.03573438 - time (sec): 63.03 - samples/sec: 1821.47 - lr: 0.000014 - momentum: 0.000000
2023-10-13 22:53:13,757 epoch 6 - iter 1584/1984 - loss 0.03570621 - time (sec): 71.95 - samples/sec: 1821.80 - lr: 0.000014 - momentum: 0.000000
2023-10-13 22:53:23,051 epoch 6 - iter 1782/1984 - loss 0.03561765 - time (sec): 81.24 - samples/sec: 1803.74 - lr: 0.000014 - momentum: 0.000000
2023-10-13 22:53:32,057 epoch 6 - iter 1980/1984 - loss 0.03560613 - time (sec): 90.25 - samples/sec: 1811.62 - lr: 0.000013 - momentum: 0.000000
2023-10-13 22:53:32,243 ----------------------------------------------------------------------------------------------------
2023-10-13 22:53:32,243 EPOCH 6 done: loss 0.0355 - lr: 0.000013
2023-10-13 22:53:35,667 DEV : loss 0.19200846552848816 - f1-score (micro avg) 0.7571
2023-10-13 22:53:35,690 saving best model
2023-10-13 22:53:36,227 ----------------------------------------------------------------------------------------------------
2023-10-13 22:53:45,464 epoch 7 - iter 198/1984 - loss 0.01991158 - time (sec): 9.23 - samples/sec: 1858.36 - lr: 0.000013 - momentum: 0.000000
2023-10-13 22:53:54,516 epoch 7 - iter 396/1984 - loss 0.02651805 - time (sec): 18.28 - samples/sec: 1844.81 - lr: 0.000013 - momentum: 0.000000
2023-10-13 22:54:03,473 epoch 7 - iter 594/1984 - loss 0.02637781 - time (sec): 27.24 - samples/sec: 1848.97 - lr: 0.000012 - momentum: 0.000000
2023-10-13 22:54:12,481 epoch 7 - iter 792/1984 - loss 0.02677852 - time (sec): 36.25 - samples/sec: 1845.67 - lr: 0.000012 - momentum: 0.000000
2023-10-13 22:54:21,402 epoch 7 - iter 990/1984 - loss 0.02881749 - time (sec): 45.17 - samples/sec: 1842.80 - lr: 0.000012 - momentum: 0.000000
2023-10-13 22:54:30,402 epoch 7 - iter 1188/1984 - loss 0.02858223 - time (sec): 54.17 - samples/sec: 1853.37 - lr: 0.000011 - momentum: 0.000000
2023-10-13 22:54:39,379 epoch 7 - iter 1386/1984 - loss 0.02830102 - time (sec): 63.15 - samples/sec: 1833.93 - lr: 0.000011 - momentum: 0.000000
2023-10-13 22:54:48,384 epoch 7 - iter 1584/1984 - loss 0.02747017 - time (sec): 72.15 - samples/sec: 1827.82 - lr: 0.000011 - momentum: 0.000000
2023-10-13 22:54:57,360 epoch 7 - iter 1782/1984 - loss 0.02717714 - time (sec): 81.13 - samples/sec: 1826.93 - lr: 0.000010 - momentum: 0.000000
2023-10-13 22:55:06,509 epoch 7 - iter 1980/1984 - loss 0.02758899 - time (sec): 90.28 - samples/sec: 1812.63 - lr: 0.000010 - momentum: 0.000000
2023-10-13 22:55:06,690 ----------------------------------------------------------------------------------------------------
2023-10-13 22:55:06,690 EPOCH 7 done: loss 0.0275 - lr: 0.000010
2023-10-13 22:55:10,051 DEV : loss 0.20744635164737701 - f1-score (micro avg) 0.7666
2023-10-13 22:55:10,071 saving best model
2023-10-13 22:55:10,919 ----------------------------------------------------------------------------------------------------
2023-10-13 22:55:19,880 epoch 8 - iter 198/1984 - loss 0.02610314 - time (sec): 8.96 - samples/sec: 1850.88 - lr: 0.000010 - momentum: 0.000000
2023-10-13 22:55:28,864 epoch 8 - iter 396/1984 - loss 0.02132016 - time (sec): 17.94 - samples/sec: 1827.81 - lr: 0.000009 - momentum: 0.000000
2023-10-13 22:55:38,088 epoch 8 - iter 594/1984 - loss 0.02012992 - time (sec): 27.17 - samples/sec: 1814.18 - lr: 0.000009 - momentum: 0.000000
2023-10-13 22:55:47,028 epoch 8 - iter 792/1984 - loss 0.02027899 - time (sec): 36.11 - samples/sec: 1807.99 - lr: 0.000009 - momentum: 0.000000
2023-10-13 22:55:56,139 epoch 8 - iter 990/1984 - loss 0.01932362 - time (sec): 45.22 - samples/sec: 1797.05 - lr: 0.000008 - momentum: 0.000000
2023-10-13 22:56:05,665 epoch 8 - iter 1188/1984 - loss 0.02059177 - time (sec): 54.74 - samples/sec: 1790.75 - lr: 0.000008 - momentum: 0.000000
2023-10-13 22:56:14,696 epoch 8 - iter 1386/1984 - loss 0.02038701 - time (sec): 63.78 - samples/sec: 1802.34 - lr: 0.000008 - momentum: 0.000000
2023-10-13 22:56:23,565 epoch 8 - iter 1584/1984 - loss 0.02069282 - time (sec): 72.64 - samples/sec: 1803.00 - lr: 0.000007 - momentum: 0.000000
2023-10-13 22:56:32,647 epoch 8 - iter 1782/1984 - loss 0.02056342 - time (sec): 81.73 - samples/sec: 1798.72 - lr: 0.000007 - momentum: 0.000000
2023-10-13 22:56:41,597 epoch 8 - iter 1980/1984 - loss 0.02020247 - time (sec): 90.68 - samples/sec: 1804.78 - lr: 0.000007 - momentum: 0.000000
2023-10-13 22:56:41,779 ----------------------------------------------------------------------------------------------------
2023-10-13 22:56:41,779 EPOCH 8 done: loss 0.0203 - lr: 0.000007
2023-10-13 22:56:45,247 DEV : loss 0.2160894274711609 - f1-score (micro avg) 0.7539
2023-10-13 22:56:45,268 ----------------------------------------------------------------------------------------------------
2023-10-13 22:56:54,505 epoch 9 - iter 198/1984 - loss 0.00746271 - time (sec): 9.24 - samples/sec: 1666.05 - lr: 0.000006 - momentum: 0.000000
2023-10-13 22:57:03,468 epoch 9 - iter 396/1984 - loss 0.01246335 - time (sec): 18.20 - samples/sec: 1713.75 - lr: 0.000006 - momentum: 0.000000
2023-10-13 22:57:12,460 epoch 9 - iter 594/1984 - loss 0.01222490 - time (sec): 27.19 - samples/sec: 1776.85 - lr: 0.000006 - momentum: 0.000000
2023-10-13 22:57:21,470 epoch 9 - iter 792/1984 - loss 0.01220938 - time (sec): 36.20 - samples/sec: 1795.95 - lr: 0.000005 - momentum: 0.000000
2023-10-13 22:57:30,847 epoch 9 - iter 990/1984 - loss 0.01368129 - time (sec): 45.58 - samples/sec: 1814.95 - lr: 0.000005 - momentum: 0.000000
2023-10-13 22:57:39,889 epoch 9 - iter 1188/1984 - loss 0.01266492 - time (sec): 54.62 - samples/sec: 1814.26 - lr: 0.000005 - momentum: 0.000000
2023-10-13 22:57:49,004 epoch 9 - iter 1386/1984 - loss 0.01257140 - time (sec): 63.74 - samples/sec: 1808.68 - lr: 0.000004 - momentum: 0.000000
2023-10-13 22:57:58,054 epoch 9 - iter 1584/1984 - loss 0.01226365 - time (sec): 72.79 - samples/sec: 1807.51 - lr: 0.000004 - momentum: 0.000000
2023-10-13 22:58:06,967 epoch 9 - iter 1782/1984 - loss 0.01268987 - time (sec): 81.70 - samples/sec: 1802.17 - lr: 0.000004 - momentum: 0.000000
2023-10-13 22:58:15,967 epoch 9 - iter 1980/1984 - loss 0.01248644 - time (sec): 90.70 - samples/sec: 1804.81 - lr: 0.000003 - momentum: 0.000000
2023-10-13 22:58:16,146 ----------------------------------------------------------------------------------------------------
2023-10-13 22:58:16,146 EPOCH 9 done: loss 0.0126 - lr: 0.000003
2023-10-13 22:58:19,625 DEV : loss 0.22268585860729218 - f1-score (micro avg) 0.7629
2023-10-13 22:58:19,646 ----------------------------------------------------------------------------------------------------
2023-10-13 22:58:28,696 epoch 10 - iter 198/1984 - loss 0.01007200 - time (sec): 9.05 - samples/sec: 1701.45 - lr: 0.000003 - momentum: 0.000000
2023-10-13 22:58:37,663 epoch 10 - iter 396/1984 - loss 0.00885874 - time (sec): 18.02 - samples/sec: 1751.39 - lr: 0.000003 - momentum: 0.000000
2023-10-13 22:58:46,584 epoch 10 - iter 594/1984 - loss 0.00972978 - time (sec): 26.94 - samples/sec: 1752.48 - lr: 0.000002 - momentum: 0.000000
2023-10-13 22:58:55,648 epoch 10 - iter 792/1984 - loss 0.00994088 - time (sec): 36.00 - samples/sec: 1761.21 - lr: 0.000002 - momentum: 0.000000
2023-10-13 22:59:04,705 epoch 10 - iter 990/1984 - loss 0.01002788 - time (sec): 45.06 - samples/sec: 1769.36 - lr: 0.000002 - momentum: 0.000000
2023-10-13 22:59:13,711 epoch 10 - iter 1188/1984 - loss 0.00996187 - time (sec): 54.06 - samples/sec: 1790.42 - lr: 0.000001 - momentum: 0.000000
2023-10-13 22:59:22,835 epoch 10 - iter 1386/1984 - loss 0.00984624 - time (sec): 63.19 - samples/sec: 1811.97 - lr: 0.000001 - momentum: 0.000000
2023-10-13 22:59:31,920 epoch 10 - iter 1584/1984 - loss 0.00915830 - time (sec): 72.27 - samples/sec: 1815.67 - lr: 0.000001 - momentum: 0.000000
2023-10-13 22:59:41,048 epoch 10 - iter 1782/1984 - loss 0.00916070 - time (sec): 81.40 - samples/sec: 1810.01 - lr: 0.000000 - momentum: 0.000000
2023-10-13 22:59:50,115 epoch 10 - iter 1980/1984 - loss 0.00968803 - time (sec): 90.47 - samples/sec: 1809.43 - lr: 0.000000 - momentum: 0.000000
2023-10-13 22:59:50,294 ----------------------------------------------------------------------------------------------------
2023-10-13 22:59:50,294 EPOCH 10 done: loss 0.0097 - lr: 0.000000
2023-10-13 22:59:54,151 DEV : loss 0.2287728190422058 - f1-score (micro avg) 0.7659
2023-10-13 22:59:54,607 ----------------------------------------------------------------------------------------------------
2023-10-13 22:59:54,608 Loading model from best epoch ...
2023-10-13 22:59:56,028 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-13 22:59:59,358 Results:
- F-score (micro) 0.7872
- F-score (macro) 0.6794
- Accuracy 0.6667

By class:
              precision    recall  f1-score   support

         LOC     0.8251    0.8718    0.8478       655
         PER     0.7406    0.7937    0.7662       223
         ORG     0.5915    0.3307    0.4242       127

   micro avg     0.7884    0.7861    0.7872      1005
   macro avg     0.7191    0.6654    0.6794      1005
weighted avg     0.7769    0.7861    0.7762      1005

2023-10-13 22:59:59,358 ----------------------------------------------------------------------------------------------------
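The lr column in the log follows the LinearScheduler plugin: with warmup_fraction 0.1 over 10 epochs of 1984 iterations (19840 steps total), the rate climbs linearly to the peak 3e-05 by the end of epoch 1, then decays linearly to 0. A minimal sketch of that schedule in plain Python (the function name is ours, not Flair's; step counts and the peak rate are taken from the log):

```python
def linear_warmup_lr(step, base_lr=3e-05, total_steps=19840, warmup_fraction=0.1):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0."""
    warmup_steps = int(total_steps * warmup_fraction)  # 1984 = exactly one epoch here
    if step <= warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * (total_steps - step) / (total_steps - warmup_steps)

# Matches the logged values: lr 0.000003 at epoch 1 / iter 198,
# the peak 0.000030 at the end of epoch 1, and 0.000000 at the final step.
print(linear_warmup_lr(198), linear_warmup_lr(1984), linear_warmup_lr(19840))
```

This reproduces why the lr column rises through epoch 1 and shrinks by roughly 3e-06 per subsequent epoch.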
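The summary scores in the final report can be cross-checked from the per-class columns: each f1-score is the harmonic mean of its row's precision and recall, and the macro average is the unweighted mean of the three class F1s. A small sketch using the reported numbers (the helper name is ours):

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall) per class, as reported in the "By class" table
per_class = {
    "LOC": (0.8251, 0.8718),
    "PER": (0.7406, 0.7937),
    "ORG": (0.5915, 0.3307),
}
class_f1 = {label: f1(p, r) for label, (p, r) in per_class.items()}
macro_f1 = sum(class_f1.values()) / len(class_f1)   # unweighted mean over classes
micro_f1 = f1(0.7884, 0.7861)                       # from the micro avg row

print({k: round(v, 4) for k, v in class_f1.items()}, round(macro_f1, 4), round(micro_f1, 4))
```

Rounded to four decimals this recovers the table's 0.8478/0.7662/0.4242 class F1s, the macro F-score 0.6794, and the micro F-score 0.7872, confirming the report is internally consistent. Note the low ORG recall (0.3307 over 127 mentions) is what drags the macro average well below the micro average.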