2023-10-13 21:09:52,051 ----------------------------------------------------------------------------------------------------
2023-10-13 21:09:52,052 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-13 21:09:52,052 ----------------------------------------------------------------------------------------------------
2023-10-13 21:09:52,052 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-13 21:09:52,052 ----------------------------------------------------------------------------------------------------
2023-10-13 21:09:52,052 Train: 7936 sentences
2023-10-13 21:09:52,052 (train_with_dev=False, train_with_test=False)
2023-10-13 21:09:52,052 ----------------------------------------------------------------------------------------------------
2023-10-13 21:09:52,052 Training Params:
2023-10-13 21:09:52,053 - learning_rate: "5e-05"
2023-10-13 21:09:52,053 - mini_batch_size: "4"
2023-10-13 21:09:52,053 - max_epochs: "10"
2023-10-13 21:09:52,053 - shuffle: "True"
2023-10-13 21:09:52,053 ----------------------------------------------------------------------------------------------------
2023-10-13 21:09:52,053 Plugins:
2023-10-13 21:09:52,053 - LinearScheduler | warmup_fraction: '0.1'
2023-10-13 21:09:52,053 ----------------------------------------------------------------------------------------------------
2023-10-13 21:09:52,053 Final evaluation on model from best epoch (best-model.pt)
2023-10-13 21:09:52,053 - metric: "('micro avg', 'f1-score')"
2023-10-13 21:09:52,053 ----------------------------------------------------------------------------------------------------
2023-10-13 21:09:52,053 Computation:
2023-10-13 21:09:52,053 - compute on device: cuda:0
2023-10-13 21:09:52,053 - embedding storage: none
2023-10-13 21:09:52,053 ----------------------------------------------------------------------------------------------------
2023-10-13 21:09:52,053 Model training base path: "hmbench-icdar/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-13 21:09:52,053 ----------------------------------------------------------------------------------------------------
2023-10-13 21:09:52,053 ----------------------------------------------------------------------------------------------------
2023-10-13 21:10:01,654 epoch 1 - iter 198/1984 - loss 1.57314828 - time (sec): 9.60 - samples/sec: 1674.91 - lr: 0.000005 - momentum: 0.000000
2023-10-13 21:10:10,891 epoch 1 - iter 396/1984 - loss 0.94482932 - time (sec): 18.84 - samples/sec: 1728.91 - lr: 0.000010 - momentum: 0.000000
2023-10-13 21:10:19,877 epoch 1 - iter 594/1984 - loss 0.70656946 - time (sec): 27.82 - samples/sec: 1767.52 - lr: 0.000015 - momentum: 0.000000
2023-10-13 21:10:28,806 epoch 1 - iter 792/1984 - loss 0.57589195 - time (sec): 36.75 - samples/sec: 1776.73 - lr: 0.000020 - momentum: 0.000000
2023-10-13 21:10:37,770 epoch 1 - iter 990/1984 - loss 0.49394510 - time (sec): 45.72 - samples/sec: 1780.76 - lr: 0.000025 - momentum: 0.000000
2023-10-13 21:10:47,231 epoch 1 - iter 1188/1984 - loss 0.43440653 - time (sec): 55.18 - samples/sec: 1768.08 - lr: 0.000030 - momentum: 0.000000
2023-10-13 21:10:56,769 epoch 1 - iter 1386/1984 - loss 0.39123991 - time (sec): 64.71 - samples/sec: 1772.12 - lr: 0.000035 - momentum: 0.000000
2023-10-13 21:11:05,879 epoch 1 - iter 1584/1984 - loss 0.35692182 - time (sec): 73.82 - samples/sec: 1790.27 - lr: 0.000040 - momentum: 0.000000
2023-10-13 21:11:14,766 epoch 1 - iter 1782/1984 - loss 0.33500315 - time (sec): 82.71 - samples/sec: 1790.47 - lr: 0.000045 - momentum: 0.000000
2023-10-13 21:11:23,785 epoch 1 - iter 1980/1984 - loss 0.31600489 - time (sec): 91.73 - samples/sec: 1785.44 - lr: 0.000050 - momentum: 0.000000
2023-10-13 21:11:23,964 ----------------------------------------------------------------------------------------------------
2023-10-13 21:11:23,964 EPOCH 1 done: loss 0.3157 - lr: 0.000050
2023-10-13 21:11:27,089 DEV : loss 0.10700166970491409 - f1-score (micro avg) 0.6774
2023-10-13 21:11:27,109 saving best model
2023-10-13 21:11:27,560 ----------------------------------------------------------------------------------------------------
2023-10-13 21:11:36,750 epoch 2 - iter 198/1984 - loss 0.12449949 - time (sec): 9.19 - samples/sec: 1878.49 - lr: 0.000049 - momentum: 0.000000
2023-10-13 21:11:45,799 epoch 2 - iter 396/1984 - loss 0.12255206 - time (sec): 18.24 - samples/sec: 1777.92 - lr: 0.000049 - momentum: 0.000000
2023-10-13 21:11:55,488 epoch 2 - iter 594/1984 - loss 0.12029581 - time (sec): 27.93 - samples/sec: 1784.54 - lr: 0.000048 - momentum: 0.000000
2023-10-13 21:12:04,857 epoch 2 - iter 792/1984 - loss 0.11861062 - time (sec): 37.29 - samples/sec: 1736.16 - lr: 0.000048 - momentum: 0.000000
2023-10-13 21:12:14,075 epoch 2 - iter 990/1984 - loss 0.11960110 - time (sec): 46.51 - samples/sec: 1766.04 - lr: 0.000047 - momentum: 0.000000
2023-10-13 21:12:22,965 epoch 2 - iter 1188/1984 - loss 0.11657540 - time (sec): 55.40 - samples/sec: 1770.76 - lr: 0.000047 - momentum: 0.000000
2023-10-13 21:12:32,062 epoch 2 - iter 1386/1984 - loss 0.11794534 - time (sec): 64.50 - samples/sec: 1776.69 - lr: 0.000046 - momentum: 0.000000
2023-10-13 21:12:41,040 epoch 2 - iter 1584/1984 - loss 0.11799208 - time (sec): 73.48 - samples/sec: 1776.65 - lr: 0.000046 - momentum: 0.000000
2023-10-13 21:12:49,960 epoch 2 - iter 1782/1984 - loss 0.11734119 - time (sec): 82.40 - samples/sec: 1787.75 - lr: 0.000045 - momentum: 0.000000
2023-10-13 21:12:58,919 epoch 2 - iter 1980/1984 - loss 0.11668880 - time (sec): 91.36 - samples/sec: 1792.51 - lr: 0.000044 - momentum: 0.000000
2023-10-13 21:12:59,093 ----------------------------------------------------------------------------------------------------
2023-10-13 21:12:59,093 EPOCH 2 done: loss 0.1166 - lr: 0.000044
2023-10-13 21:13:02,923 DEV : loss 0.098160520195961 - f1-score (micro avg) 0.7029
2023-10-13 21:13:02,942 saving best model
2023-10-13 21:13:03,474 ----------------------------------------------------------------------------------------------------
2023-10-13 21:13:12,412 epoch 3 - iter 198/1984 - loss 0.08773465 - time (sec): 8.94 - samples/sec: 1833.04 - lr: 0.000044 - momentum: 0.000000
2023-10-13 21:13:21,418 epoch 3 - iter 396/1984 - loss 0.08439184 - time (sec): 17.94 - samples/sec: 1776.93 - lr: 0.000043 - momentum: 0.000000
2023-10-13 21:13:30,440 epoch 3 - iter 594/1984 - loss 0.07868581 - time (sec): 26.96 - samples/sec: 1833.42 - lr: 0.000043 - momentum: 0.000000
2023-10-13 21:13:39,685 epoch 3 - iter 792/1984 - loss 0.08531933 - time (sec): 36.21 - samples/sec: 1840.58 - lr: 0.000042 - momentum: 0.000000
2023-10-13 21:13:48,778 epoch 3 - iter 990/1984 - loss 0.08514713 - time (sec): 45.30 - samples/sec: 1828.82 - lr: 0.000042 - momentum: 0.000000
2023-10-13 21:13:57,734 epoch 3 - iter 1188/1984 - loss 0.08825157 - time (sec): 54.26 - samples/sec: 1825.18 - lr: 0.000041 - momentum: 0.000000
2023-10-13 21:14:06,673 epoch 3 - iter 1386/1984 - loss 0.08814664 - time (sec): 63.20 - samples/sec: 1819.58 - lr: 0.000041 - momentum: 0.000000
2023-10-13 21:14:15,704 epoch 3 - iter 1584/1984 - loss 0.09009066 - time (sec): 72.23 - samples/sec: 1820.88 - lr: 0.000040 - momentum: 0.000000
2023-10-13 21:14:24,715 epoch 3 - iter 1782/1984 - loss 0.09006858 - time (sec): 81.24 - samples/sec: 1818.73 - lr: 0.000039 - momentum: 0.000000
2023-10-13 21:14:33,667 epoch 3 - iter 1980/1984 - loss 0.08914040 - time (sec): 90.19 - samples/sec: 1813.41 - lr: 0.000039 - momentum: 0.000000
2023-10-13 21:14:33,845 ----------------------------------------------------------------------------------------------------
2023-10-13 21:14:33,845 EPOCH 3 done: loss 0.0890 - lr: 0.000039
2023-10-13 21:14:37,406 DEV : loss 0.12662376463413239 - f1-score (micro avg) 0.7339
2023-10-13 21:14:37,432 saving best model
2023-10-13 21:14:38,036 ----------------------------------------------------------------------------------------------------
2023-10-13 21:14:47,082 epoch 4 - iter 198/1984 - loss 0.06364157 - time (sec): 9.04 - samples/sec: 1828.28 - lr: 0.000038 - momentum: 0.000000
2023-10-13 21:14:56,152 epoch 4 - iter 396/1984 - loss 0.06757552 - time (sec): 18.11 - samples/sec: 1835.62 - lr: 0.000038 - momentum: 0.000000
2023-10-13 21:15:05,034 epoch 4 - iter 594/1984 - loss 0.06841129 - time (sec): 26.99 - samples/sec: 1817.44 - lr: 0.000037 - momentum: 0.000000
2023-10-13 21:15:13,994 epoch 4 - iter 792/1984 - loss 0.06652070 - time (sec): 35.95 - samples/sec: 1809.83 - lr: 0.000037 - momentum: 0.000000
2023-10-13 21:15:23,002 epoch 4 - iter 990/1984 - loss 0.06676259 - time (sec): 44.96 - samples/sec: 1810.97 - lr: 0.000036 - momentum: 0.000000
2023-10-13 21:15:32,008 epoch 4 - iter 1188/1984 - loss 0.06542120 - time (sec): 53.97 - samples/sec: 1810.27 - lr: 0.000036 - momentum: 0.000000
2023-10-13 21:15:40,984 epoch 4 - iter 1386/1984 - loss 0.06788585 - time (sec): 62.94 - samples/sec: 1810.85 - lr: 0.000035 - momentum: 0.000000
2023-10-13 21:15:50,127 epoch 4 - iter 1584/1984 - loss 0.06753735 - time (sec): 72.09 - samples/sec: 1802.82 - lr: 0.000034 - momentum: 0.000000
2023-10-13 21:15:59,268 epoch 4 - iter 1782/1984 - loss 0.06658891 - time (sec): 81.23 - samples/sec: 1804.20 - lr: 0.000034 - momentum: 0.000000
2023-10-13 21:16:08,240 epoch 4 - iter 1980/1984 - loss 0.06689595 - time (sec): 90.20 - samples/sec: 1815.32 - lr: 0.000033 - momentum: 0.000000
2023-10-13 21:16:08,414 ----------------------------------------------------------------------------------------------------
2023-10-13 21:16:08,414 EPOCH 4 done: loss 0.0668 - lr: 0.000033
2023-10-13 21:16:12,009 DEV : loss 0.1626797765493393 - f1-score (micro avg) 0.7426
2023-10-13 21:16:12,038 saving best model
2023-10-13 21:16:12,616 ----------------------------------------------------------------------------------------------------
2023-10-13 21:16:22,166 epoch 5 - iter 198/1984 - loss 0.05707626 - time (sec): 9.55 - samples/sec: 1742.71 - lr: 0.000033 - momentum: 0.000000
2023-10-13 21:16:31,599 epoch 5 - iter 396/1984 - loss 0.05057017 - time (sec): 18.98 - samples/sec: 1736.15 - lr: 0.000032 - momentum: 0.000000
2023-10-13 21:16:40,701 epoch 5 - iter 594/1984 - loss 0.05284113 - time (sec): 28.08 - samples/sec: 1754.35 - lr: 0.000032 - momentum: 0.000000
2023-10-13 21:16:49,956 epoch 5 - iter 792/1984 - loss 0.05196859 - time (sec): 37.34 - samples/sec: 1785.68 - lr: 0.000031 - momentum: 0.000000
2023-10-13 21:16:58,899 epoch 5 - iter 990/1984 - loss 0.05288056 - time (sec): 46.28 - samples/sec: 1797.84 - lr: 0.000031 - momentum: 0.000000
2023-10-13 21:17:07,821 epoch 5 - iter 1188/1984 - loss 0.05233510 - time (sec): 55.20 - samples/sec: 1797.48 - lr: 0.000030 - momentum: 0.000000
2023-10-13 21:17:16,772 epoch 5 - iter 1386/1984 - loss 0.05315803 - time (sec): 64.15 - samples/sec: 1794.32 - lr: 0.000029 - momentum: 0.000000
2023-10-13 21:17:25,950 epoch 5 - iter 1584/1984 - loss 0.05266700 - time (sec): 73.33 - samples/sec: 1794.36 - lr: 0.000029 - momentum: 0.000000
2023-10-13 21:17:34,933 epoch 5 - iter 1782/1984 - loss 0.05326372 - time (sec): 82.31 - samples/sec: 1798.69 - lr: 0.000028 - momentum: 0.000000
2023-10-13 21:17:43,886 epoch 5 - iter 1980/1984 - loss 0.05366528 - time (sec): 91.27 - samples/sec: 1792.47 - lr: 0.000028 - momentum: 0.000000
2023-10-13 21:17:44,081 ----------------------------------------------------------------------------------------------------
2023-10-13 21:17:44,081 EPOCH 5 done: loss 0.0536 - lr: 0.000028
2023-10-13 21:17:47,992 DEV : loss 0.17723220586776733 - f1-score (micro avg) 0.7561
2023-10-13 21:17:48,015 saving best model
2023-10-13 21:17:48,548 ----------------------------------------------------------------------------------------------------
2023-10-13 21:17:57,694 epoch 6 - iter 198/1984 - loss 0.03763551 - time (sec): 9.15 - samples/sec: 1813.48 - lr: 0.000027 - momentum: 0.000000
2023-10-13 21:18:06,904 epoch 6 - iter 396/1984 - loss 0.03893849 - time (sec): 18.36 - samples/sec: 1842.25 - lr: 0.000027 - momentum: 0.000000
2023-10-13 21:18:16,141 epoch 6 - iter 594/1984 - loss 0.03776879 - time (sec): 27.59 - samples/sec: 1800.77 - lr: 0.000026 - momentum: 0.000000
2023-10-13 21:18:25,473 epoch 6 - iter 792/1984 - loss 0.03719490 - time (sec): 36.92 - samples/sec: 1775.51 - lr: 0.000026 - momentum: 0.000000
2023-10-13 21:18:34,661 epoch 6 - iter 990/1984 - loss 0.03762218 - time (sec): 46.11 - samples/sec: 1776.83 - lr: 0.000025 - momentum: 0.000000
2023-10-13 21:18:43,620 epoch 6 - iter 1188/1984 - loss 0.03873105 - time (sec): 55.07 - samples/sec: 1786.34 - lr: 0.000024 - momentum: 0.000000
2023-10-13 21:18:52,599 epoch 6 - iter 1386/1984 - loss 0.03856198 - time (sec): 64.05 - samples/sec: 1797.77 - lr: 0.000024 - momentum: 0.000000
2023-10-13 21:19:01,520 epoch 6 - iter 1584/1984 - loss 0.03903961 - time (sec): 72.97 - samples/sec: 1801.11 - lr: 0.000023 - momentum: 0.000000
2023-10-13 21:19:10,502 epoch 6 - iter 1782/1984 - loss 0.03926235 - time (sec): 81.95 - samples/sec: 1798.82 - lr: 0.000023 - momentum: 0.000000
2023-10-13 21:19:19,517 epoch 6 - iter 1980/1984 - loss 0.04014236 - time (sec): 90.97 - samples/sec: 1799.22 - lr: 0.000022 - momentum: 0.000000
2023-10-13 21:19:19,698 ----------------------------------------------------------------------------------------------------
2023-10-13 21:19:19,698 EPOCH 6 done: loss 0.0401 - lr: 0.000022
2023-10-13 21:19:23,200 DEV : loss 0.18782225251197815 - f1-score (micro avg) 0.744
2023-10-13 21:19:23,224 ----------------------------------------------------------------------------------------------------
2023-10-13 21:19:32,531 epoch 7 - iter 198/1984 - loss 0.02824745 - time (sec): 9.31 - samples/sec: 1651.36 - lr: 0.000022 - momentum: 0.000000
2023-10-13 21:19:41,932 epoch 7 - iter 396/1984 - loss 0.02833595 - time (sec): 18.71 - samples/sec: 1721.59 - lr: 0.000021 - momentum: 0.000000
2023-10-13 21:19:51,512 epoch 7 - iter 594/1984 - loss 0.02650201 - time (sec): 28.29 - samples/sec: 1689.25 - lr: 0.000021 - momentum: 0.000000
2023-10-13 21:20:00,933 epoch 7 - iter 792/1984 - loss 0.02830249 - time (sec): 37.71 - samples/sec: 1696.26 - lr: 0.000020 - momentum: 0.000000
2023-10-13 21:20:09,940 epoch 7 - iter 990/1984 - loss 0.02699457 - time (sec): 46.71 - samples/sec: 1737.12 - lr: 0.000019 - momentum: 0.000000
2023-10-13 21:20:19,637 epoch 7 - iter 1188/1984 - loss 0.02713766 - time (sec): 56.41 - samples/sec: 1745.60 - lr: 0.000019 - momentum: 0.000000
2023-10-13 21:20:29,015 epoch 7 - iter 1386/1984 - loss 0.02807107 - time (sec): 65.79 - samples/sec: 1727.94 - lr: 0.000018 - momentum: 0.000000
2023-10-13 21:20:38,025 epoch 7 - iter 1584/1984 - loss 0.02793388 - time (sec): 74.80 - samples/sec: 1747.11 - lr: 0.000018 - momentum: 0.000000
2023-10-13 21:20:47,096 epoch 7 - iter 1782/1984 - loss 0.02878022 - time (sec): 83.87 - samples/sec: 1753.11 - lr: 0.000017 - momentum: 0.000000
2023-10-13 21:20:55,978 epoch 7 - iter 1980/1984 - loss 0.02830615 - time (sec): 92.75 - samples/sec: 1763.80 - lr: 0.000017 - momentum: 0.000000
2023-10-13 21:20:56,162 ----------------------------------------------------------------------------------------------------
2023-10-13 21:20:56,162 EPOCH 7 done: loss 0.0282 - lr: 0.000017
2023-10-13 21:21:00,009 DEV : loss 0.19144612550735474 - f1-score (micro avg) 0.763
2023-10-13 21:21:00,029 saving best model
2023-10-13 21:21:00,591 ----------------------------------------------------------------------------------------------------
2023-10-13 21:21:09,922 epoch 8 - iter 198/1984 - loss 0.01890035 - time (sec): 9.33 - samples/sec: 1751.85 - lr: 0.000016 - momentum: 0.000000
2023-10-13 21:21:19,057 epoch 8 - iter 396/1984 - loss 0.01767485 - time (sec): 18.46 - samples/sec: 1785.14 - lr: 0.000016 - momentum: 0.000000
2023-10-13 21:21:27,952 epoch 8 - iter 594/1984 - loss 0.01621751 - time (sec): 27.36 - samples/sec: 1785.22 - lr: 0.000015 - momentum: 0.000000
2023-10-13 21:21:36,767 epoch 8 - iter 792/1984 - loss 0.01561299 - time (sec): 36.17 - samples/sec: 1819.33 - lr: 0.000014 - momentum: 0.000000
2023-10-13 21:21:45,342 epoch 8 - iter 990/1984 - loss 0.01638503 - time (sec): 44.75 - samples/sec: 1829.74 - lr: 0.000014 - momentum: 0.000000
2023-10-13 21:21:53,833 epoch 8 - iter 1188/1984 - loss 0.01763387 - time (sec): 53.24 - samples/sec: 1827.41 - lr: 0.000013 - momentum: 0.000000
2023-10-13 21:22:02,464 epoch 8 - iter 1386/1984 - loss 0.01795526 - time (sec): 61.87 - samples/sec: 1842.78 - lr: 0.000013 - momentum: 0.000000
2023-10-13 21:22:11,253 epoch 8 - iter 1584/1984 - loss 0.01774552 - time (sec): 70.66 - samples/sec: 1862.31 - lr: 0.000012 - momentum: 0.000000
2023-10-13 21:22:20,490 epoch 8 - iter 1782/1984 - loss 0.01753201 - time (sec): 79.89 - samples/sec: 1845.71 - lr: 0.000012 - momentum: 0.000000
2023-10-13 21:22:29,666 epoch 8 - iter 1980/1984 - loss 0.01755699 - time (sec): 89.07 - samples/sec: 1838.26 - lr: 0.000011 - momentum: 0.000000
2023-10-13 21:22:29,843 ----------------------------------------------------------------------------------------------------
2023-10-13 21:22:29,843 EPOCH 8 done: loss 0.0175 - lr: 0.000011
2023-10-13 21:22:33,369 DEV : loss 0.2059580534696579 - f1-score (micro avg) 0.7711
2023-10-13 21:22:33,395 saving best model
2023-10-13 21:22:33,947 ----------------------------------------------------------------------------------------------------
2023-10-13 21:22:43,129 epoch 9 - iter 198/1984 - loss 0.01401159 - time (sec): 9.18 - samples/sec: 1824.43 - lr: 0.000011 - momentum: 0.000000
2023-10-13 21:22:52,178 epoch 9 - iter 396/1984 - loss 0.01102370 - time (sec): 18.23 - samples/sec: 1798.64 - lr: 0.000010 - momentum: 0.000000
2023-10-13 21:23:01,192 epoch 9 - iter 594/1984 - loss 0.01127156 - time (sec): 27.24 - samples/sec: 1785.58 - lr: 0.000009 - momentum: 0.000000
2023-10-13 21:23:10,348 epoch 9 - iter 792/1984 - loss 0.01140865 - time (sec): 36.40 - samples/sec: 1807.20 - lr: 0.000009 - momentum: 0.000000
2023-10-13 21:23:19,348 epoch 9 - iter 990/1984 - loss 0.01159091 - time (sec): 45.40 - samples/sec: 1817.85 - lr: 0.000008 - momentum: 0.000000
2023-10-13 21:23:28,275 epoch 9 - iter 1188/1984 - loss 0.01135423 - time (sec): 54.32 - samples/sec: 1805.71 - lr: 0.000008 - momentum: 0.000000
2023-10-13 21:23:37,355 epoch 9 - iter 1386/1984 - loss 0.01161526 - time (sec): 63.40 - samples/sec: 1800.54 - lr: 0.000007 - momentum: 0.000000
2023-10-13 21:23:46,647 epoch 9 - iter 1584/1984 - loss 0.01193952 - time (sec): 72.70 - samples/sec: 1802.01 - lr: 0.000007 - momentum: 0.000000
2023-10-13 21:23:56,078 epoch 9 - iter 1782/1984 - loss 0.01265391 - time (sec): 82.13 - samples/sec: 1800.11 - lr: 0.000006 - momentum: 0.000000
2023-10-13 21:24:05,126 epoch 9 - iter 1980/1984 - loss 0.01277214 - time (sec): 91.18 - samples/sec: 1794.10 - lr: 0.000006 - momentum: 0.000000
2023-10-13 21:24:05,313 ----------------------------------------------------------------------------------------------------
2023-10-13 21:24:05,313 EPOCH 9 done: loss 0.0127 - lr: 0.000006
2023-10-13 21:24:08,821 DEV : loss 0.2048816680908203 - f1-score (micro avg) 0.7736
2023-10-13 21:24:08,849 saving best model
2023-10-13 21:24:09,365 ----------------------------------------------------------------------------------------------------
2023-10-13 21:24:18,894 epoch 10 - iter 198/1984 - loss 0.01189187 - time (sec): 9.52 - samples/sec: 1789.41 - lr: 0.000005 - momentum: 0.000000
2023-10-13 21:24:28,197 epoch 10 - iter 396/1984 - loss 0.01127133 - time (sec): 18.83 - samples/sec: 1740.55 - lr: 0.000004 - momentum: 0.000000
2023-10-13 21:24:37,540 epoch 10 - iter 594/1984 - loss 0.01116046 - time (sec): 28.17 - samples/sec: 1732.33 - lr: 0.000004 - momentum: 0.000000
2023-10-13 21:24:46,777 epoch 10 - iter 792/1984 - loss 0.01063014 - time (sec): 37.41 - samples/sec: 1749.55 - lr: 0.000003 - momentum: 0.000000
2023-10-13 21:24:56,075 epoch 10 - iter 990/1984 - loss 0.01067083 - time (sec): 46.71 - samples/sec: 1744.53 - lr: 0.000003 - momentum: 0.000000
2023-10-13 21:25:05,141 epoch 10 - iter 1188/1984 - loss 0.00975194 - time (sec): 55.77 - samples/sec: 1756.59 - lr: 0.000002 - momentum: 0.000000
2023-10-13 21:25:14,121 epoch 10 - iter 1386/1984 - loss 0.00942246 - time (sec): 64.75 - samples/sec: 1765.55 - lr: 0.000002 - momentum: 0.000000
2023-10-13 21:25:23,112 epoch 10 - iter 1584/1984 - loss 0.00913427 - time (sec): 73.74 - samples/sec: 1775.53 - lr: 0.000001 - momentum: 0.000000
2023-10-13 21:25:32,155 epoch 10 - iter 1782/1984 - loss 0.00900070 - time (sec): 82.79 - samples/sec: 1787.74 - lr: 0.000001 - momentum: 0.000000
2023-10-13 21:25:41,053 epoch 10 - iter 1980/1984 - loss 0.00866710 - time (sec): 91.68 - samples/sec: 1784.89 - lr: 0.000000 - momentum: 0.000000
2023-10-13 21:25:41,229 ----------------------------------------------------------------------------------------------------
2023-10-13 21:25:41,229 EPOCH 10 done: loss 0.0086 - lr: 0.000000
2023-10-13 21:25:45,158 DEV : loss 0.21929599344730377 - f1-score (micro avg) 0.774
2023-10-13 21:25:45,188 saving best model
2023-10-13 21:25:46,137 ----------------------------------------------------------------------------------------------------
2023-10-13 21:25:46,138 Loading model from best epoch ...
2023-10-13 21:25:47,640 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-13 21:25:50,920
Results:
- F-score (micro) 0.7904
- F-score (macro) 0.7146
- Accuracy 0.6747

By class:
              precision    recall  f1-score   support

         LOC     0.8228    0.8580    0.8401       655
         PER     0.7312    0.8296    0.7773       223
         ORG     0.5941    0.4724    0.5263       127

   micro avg     0.7782    0.8030    0.7904      1005
   macro avg     0.7160    0.7200    0.7146      1005
weighted avg     0.7736    0.8030    0.7865      1005

2023-10-13 21:25:50,920 ----------------------------------------------------------------------------------------------------